Hi Brigid.

Here are a few commands that should do what you want:

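library(XML)

# Parse the file into an XML document
bri = xmlParse("myDataFile.xml")
# One character vector of attributes per <T> child node; transpose so
# that each node is a row, then drop the first column ("N")
tmp = t(xmlSApply(xmlRoot(bri), xmlAttrs))[, -1]
dd = as.data.frame(tmp, stringsAsFactors = FALSE,
                   row.names = 1:nrow(tmp))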

At this point every column of dd is a character vector, so you can
then convert the columns to whatever types you want using regular R
commands such as as.numeric().
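
As a rough sketch of that conversion step (not part of the commands above): the data frame here is a small hand-made stand-in for dd, the column name "when" is made up for illustration, and reading the root node's D="1/3/2007" as month/day/year is an assumption on my part.

```r
# Hand-made stand-in for dd: every column is character, as it would be
# after as.data.frame(tmp, stringsAsFactors = FALSE)
dd = data.frame(T = c("9:30:13 AM", "9:31:05 AM"),
                A = c("30.05", "29.89"),
                B = c("29.85", "29.78"),
                C = c("30.05", "30.05"),
                stringsAsFactors = FALSE)

# Convert the price-like columns to numeric in one go
num_cols = c("A", "B", "C")
dd[num_cols] = lapply(dd[num_cols], as.numeric)

# Combine the date from the root node (assumed month/day/year) with the
# time of day; %I and %p handle the 12-hour clock and the AM/PM marker
# (matching AM/PM depends on the locale; this assumes an English or C locale)
dd$when = as.POSIXct(paste("1/3/2007", dd$T),
                     format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

str(dd)
```

The tz = "UTC" is only there to make the result independent of the machine's time zone; use whatever zone the data were recorded in.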

The basic idea is that for each of the child nodes of the root <C>
node, i.e. the <T> nodes, we want the character vector of attributes,
which we can get with xmlAttrs().

xmlSApply() collects these vectors as the columns of a matrix, so we
transpose it with t() to get one row per node, drop the "N" column,
and then convert the result to a data frame. Supplying row.names
avoids the duplicate row names, which would otherwise all be "T".
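
To see the intermediate shapes without needing the file, here is a small sketch in base R. The list "attrs" is a hand-written stand-in for what xmlAttrs() gives back for each <T> node (it is not produced by the XML package here), and sapply() plays the role of xmlSApply().

```r
# Stand-in for the per-node attribute vectors; each element is named
# "T" after its node, just as the children of the root would be
attrs = list(
  T = c(N = "1", T = "9:30:13 AM", A = "30.05", B = "29.85", C = "30.05"),
  T = c(N = "2", T = "9:31:05 AM", A = "29.89", B = "29.78", C = "30.05"),
  T = c(N = "3", T = "9:31:05 AM", A = "29.9",  B = "29.86", C = "29.87")
)

# Simplifies to a 5 x 3 character matrix: one column per node,
# one row per attribute
m = sapply(attrs, identity)
dim(m)

# Transpose to one row per node and drop the first column ("N");
# the row names are now the duplicated node names "T", "T", "T"
tmp = t(m)[, -1]
rownames(tmp)

# Explicit row names sidestep the duplicates
dd = as.data.frame(tmp, stringsAsFactors = FALSE,
                   row.names = 1:nrow(tmp))
dd
```

Leaving off the [, -1] would keep N as a column, if you do want it in the data frame.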

(BTW, make certain the '-' on the second line is not in the XML content.
 I assume that came from bringing the text into mail.)

HTH
  D.


Brigid Mooney wrote:
Hi,

I am trying to parse XML files and read them into R as a data frame,
but have been unable to find examples which I could apply
successfully.

I'm afraid I don't know much about XML, which makes this all the more
difficult.  If someone could point me in the right direction to a
resource (preferably with an example or two), it would be greatly
appreciated.

Here is a snippet from one of the XML files that I am looking to read,
and I am aiming to be able to get it into a data frame with columns N,
T, A, B, C as in the 2nd level of the hierarchy.

  <?xml version="1.0" encoding="utf-8" ?>
- <C S="UnitA" D="1/3/2007" C="24745" F="24648">
  <T N="1" T="9:30:13 AM" A="30.05" B="29.85" C="30.05" />
  <T N="2" T="9:31:05 AM" A="29.89" B="29.78" C="30.05" />
  <T N="3" T="9:31:05 AM" A="29.9" B="29.86" C="29.87" />
  <T N="4" T="9:31:05 AM" A="29.86" B="29.86" C="29.87" />
  <T N="5" T="9:31:05 AM" A="29.89" B="29.86" C="29.87" />
  <T N="6" T="9:31:06 AM" A="29.89" B="29.85" C="29.86" />
  <T N="7" T="9:31:06 AM" A="29.89" B="29.85" C="29.86" />
  <T N="8" T="9:31:06 AM" A="29.89" B="29.85" C="29.86" />
</C>

Thanks for any help or direction anyone can provide.

As a point of reference, I am using R 2.8.1 and have loaded the XML package.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
