Re: [Cdk-user] IUPAC name generation
I knew someone would say that :)

Kind regards,

Chris

--
Prof. Dr. Christoph Steinbeck
Vice President for Digitalisation of the Friedrich-Schiller-University Jena

Analytical Chemistry - Cheminformatics and Chemometrics
Friedrich-Schiller-University Jena, Germany
Phone Team Assistant: +49-3641-948171
http://cheminf.uni-jena.de
http://orcid.org/-0001-6966-0814

What is man but that lofty spirit - that sense of enterprise.
... Kirk, "I, Mudd," stardate 4513.3..

> On 29. Jul 2023, at 15:59, Markus Sitzmann wrote:
>
> Hello,
>
> On the other hand, a chemical structure database with 100% naming
> correctness - from where do I get one of these marvels - with or without
> tools and AI? ;-)
>
> Best,
> Markus
Re: [Cdk-user] IUPAC name generation
Hello,

On the other hand, a chemical structure database with 100% naming
correctness - from where do I get one of these marvels - with or without
tools and AI? ;-)

Best,
Markus

On Sat, Jul 29, 2023 at 12:32 PM Christoph Steinbeck
<christoph.steinb...@uni-jena.de> wrote:

> Hi Yong,
>
> I wanted to say that STOUT only generates correct IUPAC names in 90% of
> the cases. Like many deep-learning-based applications, it dreams up the
> rest. Those wrong ones will usually still be relatively close to the
> correct ones, so it depends on your application whether STOUT is helpful
> for you.
> If, for example, you are a chemist (which you aren’t, I read this :), but
> let’s assume you are) and you want to name a few compounds for a paper,
> then STOUT is helpful, and you can correct its occasional mistakes and
> still have a much easier life.
> If, however, you want to name 10k compounds in your database, then you
> cannot live with <100% naming success.
> If you need 100%, you need to switch to an algorithmic tool like Marvin's
> name generator, referenced in our paper.
> I hope that makes sense.
>
> Kind regards,
>
> Chris
Re: [Cdk-user] IUPAC name generation
Hi Yong,

I wanted to say that STOUT only generates correct IUPAC names in 90% of
the cases. Like many deep-learning-based applications, it dreams up the
rest. Those wrong ones will usually still be relatively close to the
correct ones, so it depends on your application whether STOUT is helpful
for you.

If, for example, you are a chemist (which you aren’t, I read this :), but
let’s assume you are) and you want to name a few compounds for a paper,
then STOUT is helpful, and you can correct its occasional mistakes and
still have a much easier life.

If, however, you want to name 10k compounds in your database, then you
cannot live with <100% naming success.

If you need 100%, you need to switch to an algorithmic tool like Marvin's
name generator, referenced in our paper.

I hope that makes sense.

Kind regards,

Chris

> On 28. Jul 2023, at 17:36, Yong Gao wrote:
>
> Thank you, Chris.
> Tried a couple of smiles strings with STOUT. They seem to work. I’m not a
> chemist, but does non-deterministic suggest not unique?
> Yong
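To put the 90% figure from the message above into numbers, here is a small back-of-the-envelope sketch. It treats the 90% as a per-compound probability and assumes that errors are independent of each other. That independence is a simplifying assumption made here, not something stated in the thread, and the function names are purely illustrative.

```python
def expected_wrong_names(n_compounds: int, accuracy: float) -> float:
    """Expected number of wrong names if each name is right with probability `accuracy`."""
    return n_compounds * (1.0 - accuracy)


def probability_all_correct(n_compounds: int, accuracy: float) -> float:
    """Chance that every name is right, assuming independent errors (an assumption)."""
    return accuracy ** n_compounds


if __name__ == "__main__":
    # A handful of compounds for a paper versus a 10k-compound database.
    for n in (5, 10_000):
        wrong = expected_wrong_names(n, 0.90)
        all_ok = probability_all_correct(n, 0.90)
        print(f"{n} compounds: ~{wrong:.1f} wrong names expected, "
              f"P(all correct) = {all_ok:.3g}")
```

Under these assumptions, a handful of names leaves only the odd mistake to fix by hand, whereas 10k compounds would be expected to contain roughly a thousand wrong names.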
Re: [Cdk-user] IUPAC name generation
Thank you, Chris.
Tried a couple of smiles strings with STOUT. They seem to work. I’m not a
chemist, but does non-deterministic suggest not unique?

Yong

From: Christoph Steinbeck
Date: Friday, July 28, 2023 at 4:19 AM
To: Yong Gao
Cc: "cdk-user@lists.sourceforge.net"
Subject: Re: [Cdk-user] IUPAC name generation

You can try our STOUT
https://github.com/Kohulan/Smiles-TO-iUpac-Translator
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00512-4
but be aware that this deep learning tool is not deterministic and makes
occasional mistakes.
If your structures are not too complicated, you can use Egon’s tip of cross
checking with OPSIN.
If, of course, you want to name large amounts of structures in an
unsupervised manner, you need a deterministic tool.

Kind regards,
Chris
Re: [Cdk-user] IUPAC name generation
You can try our STOUT
https://github.com/Kohulan/Smiles-TO-iUpac-Translator
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00512-4
but be aware that this deep learning tool is not deterministic and makes
occasional mistakes.

If your structures are not too complicated, you can use Egon’s tip of cross
checking with OPSIN.

If, of course, you want to name large amounts of structures in an
unsupervised manner, you need a deterministic tool.

Kind regards,

Chris

> On 27.07.2023 at 17:33, Yong Gao wrote:
>
> Hi,
> Has anyone successfully generated IUPAC names from a smiles string? I see
> some code in the legacy module, but did not see a way to do it. Also, any
> suggestions for doing this with some other open source software?
> Thanks,
> Yong
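For a batch of structures, the advice above could be turned into a simple triage: accept a predicted name only when an independent check passes, and set everything else aside for a human to look at. This is a minimal sketch under stated assumptions. `predict_name` and `is_verified` are hypothetical callables standing in for a naming tool and a cross-check such as an OPSIN round trip; neither is a real STOUT or OPSIN API.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def triage_names(
    smiles_list: Iterable[str],
    predict_name: Callable[[str], str],       # hypothetical wrapper around a naming tool
    is_verified: Callable[[str, str], bool],  # hypothetical cross-check (smiles, name) -> ok?
) -> Tuple[Dict[str, str], List[str]]:
    """Split structures into auto-accepted names and ones needing manual review."""
    accepted: Dict[str, str] = {}
    needs_review: List[str] = []
    for smiles in smiles_list:
        name = predict_name(smiles)
        if is_verified(smiles, name):
            accepted[smiles] = name
        else:
            needs_review.append(smiles)
    return accepted, needs_review


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any chemistry library.
    fake_predictions = {"CCO": "ethanol", "c1ccccc1": "benzen"}
    known_good = {("CCO", "ethanol"), ("c1ccccc1", "benzene")}
    accepted, review = triage_names(
        fake_predictions,
        fake_predictions.__getitem__,
        lambda smiles, name: (smiles, name) in known_good,
    )
    print("accepted:", accepted)
    print("needs review:", review)
```

A check like this only flags names that fail the cross-check. It does not make the naming tool itself deterministic or fully reliable.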
Re: [Cdk-user] IUPAC name generation
The code in the legacy module was an academic project and has limited
functionality. There are nowadays a few open source tools that can predict
the IUPAC name. By using OPSIN you can check if that prediction makes
sense: OPSIN generates a chemical structure from the IUPAC name, and if
that structure is the same as the one the name was generated from, you
have a good shot.

See https://scholia.toolforge.org/topic/Q110408287

Egon

On Thu, 27 Jul 2023 at 17:33, Yong Gao wrote:

> Hi,
>
> Has anyone successfully generated IUPAC names from a smiles string? I see
> some code in the legacy module, but did not see a way to do it. Also, any
> suggestions for doing this with some other open source software?
>
> Thanks,
> Yong

--
Inherited disorders can be hard to interpret when multiple biomarkers are
involved. A network approach can help bring insight:
https://doi.org/10.1186/s13023-023-02683-9

--
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Blog: https://chem-bla-ics.blogspot.com/
Mastodon: https://scholar.social/@egonw
PubList: https://orcid.org/-0001-7542-0286
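The round-trip check described above can be sketched roughly as follows. The two callables are assumptions made for illustration. `name_to_smiles` stands in for a wrapper around OPSIN that returns None when a name cannot be parsed, and `canonicalize` stands in for whatever canonical form a cheminformatics toolkit provides (canonical SMILES or InChI, for example). Neither wrapper is shown here, and the names are hypothetical.

```python
from typing import Callable, Optional


def round_trip_ok(
    smiles: str,
    predicted_name: str,
    name_to_smiles: Callable[[str], Optional[str]],  # hypothetical OPSIN wrapper
    canonicalize: Callable[[str], str],              # hypothetical canonicalizer
) -> bool:
    """Return True if parsing the predicted name gives back the input structure."""
    parsed = name_to_smiles(predicted_name)
    if parsed is None:  # the name could not be parsed at all
        return False
    # Compare canonical forms, since one structure can be written as many SMILES.
    return canonicalize(parsed) == canonicalize(smiles)


if __name__ == "__main__":
    # Toy lookup tables so the sketch runs without any chemistry library.
    toy_parser = {"ethanol": "CCO", "propan-1-ol": "CCCO"}.get
    toy_canonical = {"CCO": "CCO", "OCC": "CCO", "CCCO": "CCCO"}.__getitem__
    print(round_trip_ok("OCC", "ethanol", toy_parser, toy_canonical))
    print(round_trip_ok("OCC", "propan-1-ol", toy_parser, toy_canonical))
```

Comparing canonical forms rather than raw strings matters because "OCC" and "CCO" are the same molecule written two ways. A passing round trip is good evidence, in the "good shot" sense above, but it is not proof that the name is the preferred IUPAC name.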
[Cdk-user] IUPAC name generation
Hi,

Has anyone successfully generated IUPAC names from a smiles string? I see
some code in the legacy module, but did not see a way to do it. Also, any
suggestions for doing this with some other open source software?

Thanks,
Yong

CONFIDENTIALITY NOTICE: This electronic mail transmission may contain
privileged, confidential and/or sensitive information and is intended only
for the review of the party to whom it is addressed and for the stated
purpose. Unauthorized use or disclosure is strictly prohibited. If you have
received this transmission in error, please notify
eupriv...@blueprintmedicines.com if you’re in the EU or
priv...@blueprintmedicines.com for all other locations. Then immediately
delete the transmission without reading its contents.

___
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user