Hey Peter,

I am also interested in this work. I haven't tried ChemBERT; I should give it a shot and do a little comparison. It would make a good lecture too. A little while ago I was using molminer, which is built on OSRA:

https://github.com/gorgitko/molminer

I think I have taken a divergent path from automated tooling for now. I'm working on this mapping for Cannabis sativa, but I haven't figured out how to map the relationship to the phenology. Perhaps by country of origin? By functional group? I did it manually, and I use it as a reference index here:

https://github.com/Sulstice/global-chem/blob/development/global_chem/global_chem/medicinal_chemistry/cannabinoids/constituents_of_cannabis_sativa.py

I was thinking I could use this list of master names as an index when searching other papers.

Let me know any thoughts.

Cheers,
-Sul

On Thu, Jan 19, 2023 at 6:26 AM Peter Murray-Rust <pm...@cam.ac.uk> wrote:

> What are the current Open Source tools for recognising chemical entities
> in text? OSCAR still runs but is probably somewhat overtaken by more
> recent language models. I see that HuggingFace has "ChemBERT" - does anyone
> have experience?
>
> More generally we want to extract triples of the form:
> <chemical> <relationship> <plant>
> We plan to do chemicals and plants and then look for relationships. But
> maybe people have already done this.
>
> TIA
>
> P.
>
> --
> "I always retain copyright in my papers, and nothing in any contract I
> sign with any publisher will override that fact. You should do the same".
>
> Peter Murray-Rust
> Reader Emeritus in Molecular Informatics
> Yusuf Hamied Department of Chemistry
> University of Cambridge
> CB2 1EW, UK
> +44-1223-336432
> _______________________________________________
> Blueobelisk-discuss mailing list
> Blueobelisk-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss
>

--
*Suliman Sharif*
Ph.D. Candidate Pharmaceutical Sciences | University of Maryland, School of Pharmacy
M.Sc Medicinal Chemistry | University of California, Riverside School of Medicine
B.Sc. Biochemistry | University of Texas at Austin
sharifsulim...@gmail.com
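
On the idea of using the curated list as a master index for searching other papers, here is a rough sketch of what that could look like. It is only an illustration under assumptions: the names in MASTER_NAMES are hand-picked placeholders rather than values loaded from the constituents file, and build_pattern, find_mentions and candidate_pairs are hypothetical helpers, not part of any existing package. It does plain dictionary matching and pairs each chemical with any plant name in the same sentence; the <relationship> slot of the triple is left open.

```python
import re

# Placeholder names for illustration only; in practice these would come
# from the manually curated constituents list.
MASTER_NAMES = ["cannabidiol", "cannabinol", "cannabigerol", "limonene"]


def build_pattern(names):
    """Compile one case-insensitive regex that matches any name as a whole word."""
    # Longest names first so a longer name is preferred over a shorter prefix.
    ordered = sorted(names, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in ordered)
    return re.compile(r"\b(" + alternation + r")\b", re.IGNORECASE)


def find_mentions(text, pattern):
    """Return (lowercased name, start, end) for each master-list name in text."""
    return [(m.group(1).lower(), m.start(), m.end()) for m in pattern.finditer(text)]


def candidate_pairs(sentence, chem_pattern, plant_names):
    """Pair each chemical found in the sentence with each plant name found in it.

    This is simple co-occurrence: it proposes <chemical> ? <plant> candidates
    and says nothing about what the relationship actually is.
    """
    chemicals = {name for name, _, _ in find_mentions(sentence, chem_pattern)}
    lowered = sentence.lower()
    plants = [p for p in plant_names if p.lower() in lowered]
    return sorted((c, p) for c in chemicals for p in plants)


if __name__ == "__main__":
    sentence = "Cannabidiol and limonene were isolated from Cannabis sativa."
    pattern = build_pattern(MASTER_NAMES)
    print(find_mentions(sentence, pattern))
    print(candidate_pairs(sentence, pattern, ["Cannabis sativa"]))
```

Whole-word matching means "cannabidiol" does not fire inside "cannabidiolic acid", so acid forms would need their own entries in the list. Synonyms and abbreviations (CBD, CBN) would need the same treatment.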