Hey,
Thanks for the answers. I was using the StAX parser (not XOM) and reading
the value of a tag as follows:
(snippet)
event = eventReader.nextEvent();
String smarts = event.asCharacters().getData();
The resulting string did not contain the entire SMARTS pattern; it was cut
off after the [#1] sequence, probably due to the .asCharacters() call.
There are probably safer ways of reading in the data.
Nina has already provided some examples.
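One possible quick fix, sketched under the assumption that the cut-off comes
from the parser delivering the element text as several Characters events
(a StAX reader may do this unless coalescing is switched on): set
XMLInputFactory.IS_COALESCING before creating the reader. The class name
CoalescingDemo, the method firstText and the <to> tag are made up for
illustration only.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class CoalescingDemo {

    // Returns the text of the first element named `tag`, or null if absent.
    // Hypothetical helper, not part of CDK or of the original code.
    static String firstText(String xml, String tag) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Ask the parser to merge adjacent character data into one event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLEventReader reader =
                    factory.createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()
                        && tag.equals(event.asStartElement().getName().getLocalPart())) {
                    XMLEvent next = reader.nextEvent();
                    // An empty element is followed by an end tag, not by text.
                    return next.isCharacters() ? next.asCharacters().getData() : "";
                }
            }
            return null;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<note><to>C[#1]NC</to></note>";
        System.out.println(firstText(xml, "to"));
    }
}
```

This keeps the nextEvent()/asCharacters() pattern and only changes the
factory configuration, so it should be the smallest change to existing code.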
Regards,
nick
===========
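import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public void readConfig(String configFile) {
    // First create a new XMLInputFactory
    XMLInputFactory inputFactory = XMLInputFactory.newInstance();
    try {
        // Set up a new eventReader
        InputStream in = new FileInputStream(configFile);
        XMLEventReader eventReader = inputFactory.createXMLEventReader(in);
        // Read the XML document
        while (eventReader.hasNext()) {
            XMLEvent event = eventReader.nextEvent();
            if (event.isStartElement()) {
                StartElement startElement = event.asStartElement();
                // If we have a SMARTS element, read its text.
                // SMARTS is assumed to be a String constant holding the tag
                // name, defined elsewhere in the class. Strings must be
                // compared with equals(), not ==.
                if (SMARTS.equals(startElement.getName().getLocalPart())) {
                    event = eventReader.nextEvent();
                    // Only the first Characters event is read here, which is
                    // where the value gets cut off.
                    String smarts = event.asCharacters().getData();
                    System.out.println(smarts);
                }
            }
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (XMLStreamException e) {
        e.printStackTrace();
    }
}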
==============
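The method above could probably be made more robust with
XMLEventReader.getElementText(), which collects all the text of a text-only
element up to its end tag instead of taking only the first Characters event.
A minimal sketch follows; the class SmartsReader, the method readSmarts and
the lowercase tag name "smarts" are assumptions for illustration, not names
taken from the actual config file.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class SmartsReader {

    // Hypothetical tag name; the original code uses a constant called SMARTS.
    static final String SMARTS_TAG = "smarts";

    // Collects the full text of every <smarts> element in the stream.
    static List<String> readSmarts(InputStream in) {
        List<String> result = new ArrayList<>();
        try {
            XMLEventReader reader =
                    XMLInputFactory.newInstance().createXMLEventReader(in);
            try {
                while (reader.hasNext()) {
                    XMLEvent event = reader.nextEvent();
                    if (event.isStartElement() && SMARTS_TAG.equals(
                            event.asStartElement().getName().getLocalPart())) {
                        // Must be called right after the start element was
                        // returned; reads text up to the matching end tag.
                        result.add(reader.getElementText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<config><smarts>C[#1]NC</smarts>"
                + "<smarts>[#6][#1]</smarts></config>";
        InputStream in =
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        for (String smarts : readSmarts(in)) {
            System.out.println(smarts);
        }
    }
}
```

Note that getElementText() throws an XMLStreamException if the element
contains child elements, so it only suits elements that hold plain text.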
Nick
-----Original Message-----
From: Rajarshi Guha [mailto:[email protected]]
Sent: Friday, February 03, 2012 7:39 PM
To: Nick Vandewiele
Cc: [email protected]
Subject: Re: [Cdk-user] SMARTS [#1] in xml parsing?
How are you parsing the document? Using XOM, the following snippet works fine:
import java.io.StringReader;
import nu.xom.Builder;
import nu.xom.Document;
import nu.xom.Nodes;

String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n" +
        "<note>\n" +
        "\t<to>C[#1]NC</to>\n" +
        "</note>";
Builder parser = new Builder();
Document x = parser.build(new StringReader(xml));
Nodes nodes = x.query("//to");
System.out.println("nodes.get(0).getValue() = " + nodes.get(0).getValue());
On Fri, Feb 3, 2012 at 12:46 PM, Nick Vandewiele <[email protected]>
wrote:
> Hey,
>
> I created an XML-structured file containing SMARTS strings.
>
> When I try to parse this XML file, SMARTS strings containing the
> sequence [#1] are misread by the XML parser. More specifically, the
> remaining characters of the string are no longer read in. This is
> probably due to an illegal XML character sequence.
>
> Since [#1] represents a hydrogen atom in SMARTS, I really need this
> character sequence.
>
> Since you probably have experience with XML parsing in CDK, do you
> know a way to avoid the XML misinterpretation? Or could you maybe
> point me to a CDK XML parser that handles SMARTS strings? Or is
> there perhaps an alternative way of representing a hydrogen in SMARTS?
>
> Regards,
>
> Nick
>
> _______________________________________________
> Cdk-user mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>
--
Rajarshi Guha
NIH Chemical Genomics Center