Line 4 of a V3000 molfile is there for backward compatibility (see https://en.wikipedia.org/wiki/Chemical_table_file). In V2000 this line, the counts line, carried the number of atoms and bonds, the chirality flag, etc. In V3000 the counts are given in the "M  V30 COUNTS" line instead, so technically a reader such as CDK should read line 4 and ignore it. Surprisingly, CDK 2.9 still appears to try to parse it, even though that is unnecessary.
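To make the alignment issue concrete, here is a minimal sketch of how a fixed-width reader could trip over an extra leading space on line 4. It is not CDK's actual parsing code. The class and method names are hypothetical, the 3-character field width is my reading of the CTfile layout, and the sample line is an illustrative placeholder rather than the line from the SDF in question.

```java
// Hypothetical helper, not part of CDK: checks line 4 of a molfile.
final class CountsLineCheck {
    // Assumption: atom and bond counts sit in two 3-character fields at the
    // start of the line, as in the fixed-width V2000 counts line.
    private static final int FIELD_WIDTH = 3;

    /** True if the line carries the V3000 version tag at its end. */
    static boolean isV3000(String countsLine) {
        return countsLine.trim().endsWith("V3000");
    }

    /** True if the first two fixed-width fields each parse as an integer. */
    static boolean leadingFieldsParse(String countsLine) {
        if (countsLine.length() < 2 * FIELD_WIDTH) {
            return false;
        }
        try {
            for (int i = 0; i < 2; i++) {
                String field = countsLine.substring(i * FIELD_WIDTH, (i + 1) * FIELD_WIDTH);
                Integer.parseInt(field.trim());
            }
            return true;
        } catch (NumberFormatException e) {
            // A blank or misaligned field ends up here.
            return false;
        }
    }

    public static void main(String[] args) {
        // Illustrative placeholder line with two spaces before the first 0.
        String aligned = "  0  0  0     0  0            999 V3000";
        // Same line shifted right by one extra leading space.
        String shifted = " " + aligned;
        System.out.println(isV3000(aligned) + " " + leadingFieldsParse(aligned));
        System.out.println(isV3000(shifted) + " " + leadingFieldsParse(shifted));
    }
}
```

With the extra space the first 3-character field is blank, so a reader that insists on parsing the counts fails, while a reader that only looks for the V3000 tag and ignores the rest of the line is unaffected.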
Thanks

Velusamy K. Velu
614-323-9649
<https://peruselab.com/> <https://www.linkedin.com/in/vkvelu/>
<https://twitter.com/PeruseLab> <https://www.facebook.com/PeruseLab/>

On Mon, Jan 15, 2024 at 5:40 AM Tim Dudgeon <tdudgeon...@gmail.com> wrote:

> According to the CTFile spec there should be two spaces before the first 0.
> https://www.daylight.com/meetings/mug05/Kappler/ctfile.pdf
> So it's strange that the latest version has that behaviour.
>
> BTW the 1.4 versions are available from Maven from
> https://nexus.ideaconsult.net/content/repositories/thirdparty
>
> I tried using 1.4 but that failed as SmartCyp uses classes such
> as org.openscience.cdk.interfaces.IMolecule that disappeared in 1.5.
>
> On Fri, Jan 12, 2024 at 9:11 PM Velusamy Velu <vv...@peruselab.com> wrote:
>
>> I tried to use cdk version 1.42 | 1.4.19 etc but no luck downloading
>> them. So, I used version 2.9 (the latest I guess). Yet, there was an error.
>>
>> After that I inspected your SDF and found it to have an extra space in
>> the 4th line.
>> Originally:
>>
>> 123456789012345678901234567890123456789012345678901234567890
>> 0 0 0 0 0 0 V3000
>>
>> After correction:
>>
>> 123456789012345678901234567890123456789012345678901234567890
>> 0 0 0 0 0 0 V3000
>>
>> After removing that extra space it worked fine with CDK 2.9. I have
>> attached the image created from the parsed molecule.
>>
>> Sorry, this may not be the answer you need but probably can help you yet.
>>
>> Thanks
>>
>> Velusamy K. Velu
>> (614) 323-9649
>> <https://peruselab.com> <https://www.linkedin.com/company/peruselab/>
>> <https://twitter.com/PeruseLab> <https://www.facebook.com/PeruseLab/>
>>
>> On Thu, Jan 11, 2024 at 11:14 AM Youyi Peng via Cdk-user <
>> cdk-user@lists.sourceforge.net> wrote:
>>
>>> Hi,
>>>
>>> Could you please remove my name from the email list?
>>>
>>> Thank you,
>>>
>>> Youyi
>>>
>>> *From: *John Mayfield <john.wilkinson...@gmail.com>
>>> *Date: *Thursday, January 11, 2024 at 5:14 AM
>>> *To: *Tim Dudgeon <tdudgeon...@gmail.com>
>>> *Cc: *CDK users list <cdk-user@lists.sourceforge.net>
>>> *Subject: *Re: [Cdk-user] Reading V3000 SDF
>>>
>>> So I think it's here:
>>> https://github.com/cdk/cdk/blob/cdk-1.4.19/src/main/org/openscience/cdk/graph/invariant/EquivalentClassPartitioner.java#L398
>>>
>>> Which is deep in the algorithm rather than in the reader. Can you also
>>> send the V2000 that you say works and I'll see if there is anything obvious
>>> you can do to make the V3000 work.
>>>
>>> Testing the current version I don't see the error, the only significant
>>> change was in 2013:
>>> https://github.com/cdk/cdk/commit/0aa0b794f48cdc057db133eabbcd865775a0730b
>>>
>>> On Thu, 11 Jan 2024 at 09:53, Tim Dudgeon <tdudgeon...@gmail.com> wrote:
>>>
>>> skip=true doesn't help.
>>>
>>> On Thu, Jan 11, 2024 at 8:56 AM John Mayfield <
>>> john.wilkinson...@gmail.com> wrote:
>>>
>>> Hi Tim,
>>>
>>> Why are you forced to use 1.4? I remember I made lots of improvements to
>>> the SDF reading over a decade ago (1.4 is now 10.5 years old) but these
>>> would have been in 1.5 onwards. It doesn't look like you're doing anything
>>> wrong but you could try adding skip=true to your constructor. This means if
>>> it sees something it doesn't like it continues rather than stops iterating.
>>>
>>> Best,
>>>
>>> John
>>>
>>> On Wed, 10 Jan 2024 at 16:31, Tim Dudgeon <tdudgeon...@gmail.com> wrote:
>>>
>>> I'm having difficulty reading V3000 SDF files.
>>>
>>> The IteratingMDLReader docs (
>>> https://cdk.github.io/cdk/1.4/docs/api/org/openscience/cdk/io/iterator/IteratingMDLReader.html)
>>> seem to suggest that it will read V3000, but maybe it has to be
>>> specifically told to use V3000 format (which would be a pain to work out)?
>>>
>>> I'm using it like this:
>>>
>>> File sdfFile = new File(file);
>>> IteratingMDLReader reader = new IteratingMDLReader(
>>>     new FileInputStream(sdfFile),
>>>     DefaultChemObjectBuilder.getInstance()
>>> );
>>>
>>> BTW, I'm forced into using an old 1.4 version for reasons out of my
>>> control.
>>>
>>> _______________________________________________
>>> Cdk-user mailing list
>>> Cdk-user@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/cdk-user