Re: [R] Create a Data Frame from an XML

2013-01-24 Thread Adam Gabbert
Hi Ben,

Thanks for the info. That is definitely a viable solution for the example I
provided, but more often I have larger files with more edits to make.

The main reason I went with a data frame methodology is because of the file
types I have.  Essentially, I have an XML file with 50+ "rows" by 10 "columns"
of data in a form similar to the provided example.  Separately, I have an
.xlsx file that contains 3 to 6 columns of data that must replace a
particular "column" of the data in the XML file.  So my approach was
to have one XML data frame and one .xlsx data frame, combine them as
necessary, and output a final XML file with the updated data. Apparently, it
wasn't quite as trivial a problem as I was hoping.
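
For illustration, the combine step might look roughly like the sketch below. It
uses base R only. Both data frames are made-up stand-ins (the replacement
values are invented), and matching rows on YEAR is purely an assumption; the
real key would depend on the files.

```r
# Hypothetical stand-in for the data frame built from the XML rows.
xml_df <- data.frame(
  BRAND = c("GMC", "FORD", "GMC"),
  NUM   = c(1L, 2L, 1L),
  YEAR  = c(1999L, 2000L, 2001L),
  VALUE = c(1L, 12000L, 12500L),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the sheet read from the .xlsx file.
# Assumption: YEAR identifies which row each replacement belongs to.
xlsx_df <- data.frame(
  YEAR  = c(2001L, 1999L),
  VALUE = c(13000L, 11000L)
)

# For each XML row, find the matching spreadsheet row (NA when there is none).
idx <- match(xml_df$YEAR, xlsx_df$YEAR)

# Overwrite VALUE only where a replacement exists; other rows keep their value.
has_new <- !is.na(idx)
xml_df$VALUE[has_new] <- xlsx_df$VALUE[idx[has_new]]

print(xml_df)
```

Using match() keeps the XML row order intact, which matters if the rows are
later written back out in their original sequence.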


On Wed, Jan 23, 2013 at 8:09 PM, Ben Tupper  wrote:

> Hi Adam,
>
>  On Jan 23, 2013, at 11:36 AM, Adam Gabbert wrote:
>
>  Hello Gentlemen,
>
> I mistakenly sent the message twice, because the first time I didn't
> receive a notification message so I was unsure if it went through properly.
>
> Your solutions worked great. Thank you!  I felt like I was fairly close
> just couldn't quite get the final step.
>
> Now, I'm trying to reverse the process and account for my header.
>
> In other words I have my data frame in R:
>
> BRAND   NUM  YEAR  VALUE
> GMC     1    1999  1
> FORD    2    2000  12000
> GMC     1    2001  12500
>  etc
> and I make some edits.
> BRAND   NUM  YEAR  VALUE
> DODGE   3    1999  1
> TOYOTA  4    2000  12000
> DODGE   3    2001  12500
>
>
> You needn't transform to a data frame if all you need to do is tweak the
> values of some of the attributes.  You can always set the attributes of
> each row node directly.
>
>  s <- c("  ", "  VALUE=\"1\" />",
> " ",
> " ",
> " ",
> " ")
>
> x <- xmlRoot(xmlTreeParse(s, asText = TRUE, useInternalNodes = TRUE))
>
> node <- x["row"][[1]]
> node
> xmlAttrs(node) <- c(BRAND = "BUICK", NUM = "3", YEAR = "2000", VALUE = "0")
> node
> x
>
>
>
>  So now I would need to output an XML file in the same format accounting
> for my header (essentially, add "z:" in front of row).
>
>
>
> I think that what you're describing is a namespace prefix.  Check the
> XML package help for ?xmlNamespace.  In particular, check this example on the
> help page.
>
>   node <- xmlNode("arg", xmlNode("name", "foo"), namespace="R")
>   xmlNamespace(node)
>
>
>
> Cheers,
> Ben
>
>  (What I want to output)
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> Thus far from the help I've found online I was trying to set up an xmlTree
> xml <- xmlTree()
>
> and use xml$addTag to create nodes and put in the data from my data
> frame.  I feel like I'm not really even close to a solution so I'm starting
> to believe that this might not be the best path to go down.
>
> Once again, any help is much appreciated.
>
> AG
>
>
> On Tue, Jan 22, 2013 at 6:04 PM, Duncan Temple Lang <
> dtemplel...@ucdavis.edu> wrote:
>
>>
>> Hi Adam
>>
>>  [You seem to have sent the same message twice to the mailing list.]
>>
>> There are various strategies/approaches to creating the data frame
>> from the XML.
>>
>> Perhaps the approach that most closely follows your approach is
>>
>>   xmlRoot(doc)[ "row" ]
>>
>> which returns a list of the XML nodes named "row" that are
>> children of the root node.
>>
>> So
>>   sapply(xmlRoot(doc) [ "row" ], xmlAttrs)
>>
>> yields a matrix with as many columns as there are <row> nodes
>> and with one row for each attribute (BRAND, NUM, YEAR and VALUE).
>>
>> So
>>
>>   d = t( sapply(xmlRoot(doc) [ "row" ], xmlAttrs) )
>>
>> gives you a matrix with the correct rows and column orientation
>> and now you can turn that into a data frame, converting the
>> columns into numbers, etc. as you want with regular R commands
>> (i.e. independently of the XML).
>>
>>
>>  D.
>>
>> On 1/22/13 1:43 PM, Adam Gabbert wrote:
>> >  Hello,
>> >
>> > I'm attempting to read information from an XML into a data frame in R
>> using
>> > the "XML" package. I am unable to get the data into a data frame as I
>> would
>> > like.  I have some sample code below.
>> >
>> > *XML Code:*
>> >
>> > Header...
>> >
>> > Data I want in a data frame:
>> >
>> >
>> >   
>> >   
>> >   
>> >   
>> >   
>> >   
>> >   
>> >   
>> >   
>> >   
>> >   
>> >
>> > *R Code:*
>> >
>> > doc <- xmlInternalTreeParse("Sample2.xml")
>> > top <- xmlRoot (doc)
>> > xmlName (top)
>> > names (top)
>> > art <- top [["row"]]
>> > art
>> > **
>> > *Output:*
>> >
>> >> art
>> >
>> > * *
>> >
>> >
>> > This is where I am having difficulties.  I am unable to "access"
>> additional
>> > rows; ( i.e.   )
>> >
>> > and I am unable to access the individual entries to actually create the
>> > data frame.  The data frame I would like is as follows:
>> >
>> > BRAND   NUM  YEAR  VALUE
>> > GMC     1    1999  1
>> > FORD    2    2000  12000
>> > G

Re: [R] Create a Data Frame from an XML

2013-01-24 Thread Franzini, Gabriele [Nervianoms]
Hello Adam,
I had a similar problem with a big data frame, and building an xmlTree
the clean way was extremely slow, so I resorted to a manual method. Not
tested, but if your data frame is my_df, then something like the
following should do:

buildEntry <- function(x) {
cat(paste('\n', sep=""))
}

sink(paste('my_file.xml', sep=""))

cat ('\n')
cat ('\n')
 
# invisible avoids returning a NULL in the file
invisible(apply(my_df, 1, buildEntry))
cat ("" )
sink()

And it took very little time.
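
A runnable sketch of the same manual idea, for illustration, in base R only.
The "z:row" tag and the attribute names come from this thread; the root element
name "data", the namespace URI and the helper names escape_attr and row_lines
are placeholders invented here, since the real header is not shown. It builds
the lines as a character vector and writes them with writeLines() rather than
sink().

```r
# Hypothetical data frame standing in for my_df.
my_df <- data.frame(
  BRAND = c("DODGE", "TOYOTA", "DODGE"),
  NUM   = c(3L, 4L, 3L),
  YEAR  = c(1999L, 2000L, 2001L),
  VALUE = c(1L, 12000L, 12500L),
  stringsAsFactors = FALSE
)

# Escape characters that may not appear raw in a double-quoted attribute value.
escape_attr <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  gsub("\"", "&quot;", x, fixed = TRUE)
}

# Build one self-closing element per data frame row, one attribute per column.
row_lines <- function(df, tag = "z:row") {
  cols <- lapply(df, function(col) escape_attr(as.character(col)))
  attrs <- mapply(function(nm, col) sprintf('%s="%s"', nm, col),
                  names(cols), cols, SIMPLIFY = FALSE)
  # unname() so that column names are not mistaken for paste() arguments
  paste0("  <", tag, " ", do.call(paste, unname(attrs)), "/>")
}

# Placeholder root element; replace with the real header of the target file.
xml_lines <- c('<data xmlns:z="urn:example:rows">',
               row_lines(my_df),
               "</data>")

out_file <- file.path(tempdir(), "my_file.xml")
writeLines(xml_lines, out_file)
```

Because everything is vectorised over columns, there is no per-row function
call, which is where most of the time goes with apply() on a large data frame.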

HTH,
Gabriele


-Original Message-
From: Adam Gabbert [mailto:adamjgabb...@gmail.com] 
Sent: Wednesday, January 23, 2013 5:36 PM
To: Duncan Temple Lang; btup...@bigelow.org
Cc: r-help@r-project.org
Subject: Re: [R] Create a Data Frame from an XML

Hello Gentlemen,

I mistakenly sent the message twice, because the first time I didn't
receive a notification message so I was unsure if it went through
properly.

Your solutions worked great. Thank you!  I felt like I was fairly close
just couldn't quite get the final step.

Now, I'm trying to reverse the process and account for my header.

In other words I have my data frame in R:

BRAND   NUM  YEAR  VALUE
GMC     1    1999  1
FORD    2    2000  12000
GMC     1    2001  12500
 etc
and I make some edits.
BRAND   NUM  YEAR  VALUE
DODGE   3    1999  1
TOYOTA  4    2000  12000
DODGE   3    2001  12500
So now I would need to output an XML file in the same format accounting
for my header (essentially, add "z:" in front of row).

(What I want to output)
>   
>   
>   
>   
>   
>   
>   
>   
>   
>   
>   
>   
Thus far from the help I've found online I was trying to set up an
xmlTree xml <- xmlTree()

and use xml$addTag to create nodes and put in the data from my data
frame.
I feel like I'm not really even close to a solution so I'm starting to
believe that this might not be the best path to go down.

Once again, any help is much appreciated.

AG


On Tue, Jan 22, 2013 at 6:04 PM, Duncan Temple Lang
 wrote:

>
> Hi Adam
>
>  [You seem to have sent the same message twice to the mailing list.]
>
> There are various strategies/approaches to creating the data frame 
> from the XML.
>
> Perhaps the approach that most closely follows your approach is
>
>   xmlRoot(doc)[ "row" ]
>
> which returns a list of the XML nodes named "row" that are
> children of the root node.
>
> So
>   sapply(xmlRoot(doc) [ "row" ], xmlAttrs)
>
> yields a matrix with as many columns as there are <row> nodes
> and with one row for each attribute (BRAND, NUM, YEAR and VALUE).
>
> So
>
>   d = t( sapply(xmlRoot(doc) [ "row" ], xmlAttrs) )
>
> gives you a matrix with the correct rows and column orientation and 
> now you can turn that into a data frame, converting the columns into 
> numbers, etc. as you want with regular R commands (i.e. independently 
> of the XML).
>
>
>  D.
>
> On 1/22/13 1:43 PM, Adam Gabbert wrote:
> >  Hello,
> >
> > I'm attempting to read information from an XML into a data frame in 
> > R
> using
> > the "XML" package. I am unable to get the data into a data frame as 
> > I
> would
> > like.  I have some sample code below.
> >
> > *XML Code:*
> >
> > Header...
> >
> > Data I want in a data frame:
> >
> >
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >
> > *R Code:*
> >
> > doc <- xmlInternalTreeParse("Sample2.xml")
> > top <- xmlRoot(doc)
> > xmlName(top)
> > names(top)
> > art <- top[["row"]]
> > art
> > **
> > *Output:*
> >
> >> art
> >
> > * *
> >
> >
> > This is where I am having difficulties.  I am unable to "access"
> additional
> > rows; ( i.e.   > /> )
> >
> > and I am unable to access the individual entries to actually create 
> > the data frame.  The data frame I would like is as follows:
> >
> > BRAND   NUM  YEAR  VALUE
> > GMC     1    1999  1
> > FORD    2    2000  12000
> > GMC     1    2001  12500
> > etc
> >
> > Any help or suggestions would be appreciated.  Conversely, my eventual
> > goal would be to take a data frame and write it into an XML file in the
> > previously shown format.
> >
> > Thank you
> >
> > AG
> >

Re: [R] Create a Data Frame from an XML

2013-01-23 Thread Ben Tupper
Hi Adam,

On Jan 23, 2013, at 11:36 AM, Adam Gabbert wrote:

> Hello Gentlemen,
>  
> I mistakenly sent the message twice, because the first time I didn't receive 
> a notification message so I was unsure if it went through properly.
>  
> Your solutions worked great. Thank you!  I felt like I was fairly close just 
> couldn't quite get the final step. 
>  
> Now, I'm trying to reverse the process and account for my header.
>  
> In other words I have my data frame in R:
>  
> BRAND   NUM  YEAR  VALUE
> GMC     1    1999  1
> FORD    2    2000  12000
> GMC     1    2001  12500
>  etc
> and I make some edits.
> BRAND   NUM  YEAR  VALUE
> DODGE   3    1999  1
> TOYOTA  4    2000  12000
> DODGE   3    2001  12500

You needn't transform to a data frame if all you need to do is tweak the values 
of some of the attributes.  You can always set the attributes of each row node 
directly.

s <- c("  ", " ", 
" ", 
" ",  
" ", 
" ")

x <- xmlRoot(xmlTreeParse(s, asText = TRUE, useInternalNodes = TRUE))

node <- x["row"][[1]]
node
xmlAttrs(node) <- c(BRAND = "BUICK", NUM = "3", YEAR = "2000", VALUE = "0")
node
x



> So now I would need to output an XML file in the same format accounting for my
> header (essentially, add "z:" in front of row).
>  

I think that what you're describing is a namespace prefix.  Check the XML
package help for ?xmlNamespace.  In particular, check this example on the help
page.

  node <- xmlNode("arg", xmlNode("name", "foo"), namespace="R")
  xmlNamespace(node)
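
As a rough sketch of how the prefix could be attached when building internal
nodes with newXMLNode(): the "z" prefix is taken from your message, but the
root element name "data" and the namespace URI below are placeholders made up
for this example, since the real header isn't shown.

```r
library(XML)

# Placeholder root that declares the "z" prefix; substitute the real header.
root <- newXMLNode("data", namespaceDefinitions = c(z = "urn:example:rows"))

# Create a row under the root; namespace = "z" refers to the prefix
# declared on the parent node.
node <- newXMLNode("row", namespace = "z", parent = root,
                   attrs = c(BRAND = "DODGE", NUM = "3",
                             YEAR = "1999", VALUE = "1"))

# Qualified name of the new node, then the serialised tree.
xmlName(node, full = TRUE)
cat(saveXML(root), "\n")
```

Looping over the rows of a data frame and calling newXMLNode() once per row
with parent = root would then produce one prefixed element per row.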


Cheers,
Ben

> (What I want to output)
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> Thus far from the help I've found online I was trying to set up an xmlTree
> xml <- xmlTree()
>  
> and use xml$addTag to create nodes and put in the data from my data frame.  I 
> feel like I'm not really even close to a solution so I'm starting to believe 
> that this might not be the best path to go down.
>  
> Once again, any help is much appreciated.
>  
> AG
> 
>  
> On Tue, Jan 22, 2013 at 6:04 PM, Duncan Temple Lang  
> wrote:
> 
> Hi Adam
> 
>  [You seem to have sent the same message twice to the mailing list.]
> 
> There are various strategies/approaches to creating the data frame
> from the XML.
> 
> Perhaps the approach that most closely follows your approach is
> 
>   xmlRoot(doc)[ "row" ]
> 
> which returns a list of the XML nodes named "row" that are
> children of the root node.
>
> So
>   sapply(xmlRoot(doc) [ "row" ], xmlAttrs)
>
> yields a matrix with as many columns as there are <row> nodes
> and with one row for each attribute (BRAND, NUM, YEAR and VALUE).
> 
> So
> 
>   d = t( sapply(xmlRoot(doc) [ "row" ], xmlAttrs) )
> 
> gives you a matrix with the correct rows and column orientation
> and now you can turn that into a data frame, converting the
> columns into numbers, etc. as you want with regular R commands
> (i.e. independently of the XML).
> 
> 
>  D.
> 
> On 1/22/13 1:43 PM, Adam Gabbert wrote:
> >  Hello,
> >
> > I'm attempting to read information from an XML into a data frame in R using
> > the "XML" package. I am unable to get the data into a data frame as I would
> > like.  I have some sample code below.
> >
> > *XML Code:*
> >
> > Header...
> >
> > Data I want in a data frame:
> >
> >
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >
> > *R Code:*
> >
> > doc <- xmlInternalTreeParse("Sample2.xml")
> > top <- xmlRoot (doc)
> > xmlName (top)
> > names (top)
> > art <- top [["row"]]
> > art
> > **
> > *Output:*
> >
> >> art
> >
> > * *
> >
> >
> > This is where I am having difficulties.  I am unable to "access" additional
> > rows; ( i.e.   )
> >
> > and I am unable to access the individual entries to actually create the
> > data frame.  The data frame I would like is as follows:
> >
> > BRAND   NUM  YEAR  VALUE
> > GMC     1    1999  1
> > FORD    2    2000  12000
> > GMC     1    2001  12500
> > etc
> >
> > Any help or suggestions would be appreciated.  Conversely, my eventual goal
> > would be to take a data frame and write it into an XML in the previously
> > shown format.
> >
> > Thank you
> >
> > AG
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> >
> 
> 

Ben Tupper
Bigelow Lab

Re: [R] Create a Data Frame from an XML

2013-01-23 Thread Adam Gabbert
Hello Gentlemen,

I mistakenly sent the message twice, because the first time I didn't
receive a notification message so I was unsure if it went through properly.

Your solutions worked great. Thank you!  I felt like I was fairly close but
just couldn't quite get the final step.

Now, I'm trying to reverse the process and account for my header.

In other words I have my data frame in R:

BRAND   NUM  YEAR  VALUE
GMC     1    1999  1
FORD    2    2000  12000
GMC     1    2001  12500
etc.

and I make some edits:

BRAND   NUM  YEAR  VALUE
DODGE   3    1999  1
TOYOTA  4    2000  12000
DODGE   3    2001  12500

So now I need to output an XML file in the same format, accounting for
my header (essentially, adding "z:" in front of row).

(What I want to output)
>   
>   
>   
>   
>   
>   
>   
>   
>   
>   
>   
>   
Thus far from the help I've found online I was trying to set up an xmlTree
xml <- xmlTree()

and use xml$addTag to create nodes and put in the data from my data frame.
I feel like I'm not really even close to a solution so I'm starting to
believe that this might not be the best path to go down.

Once again, any help is much appreciated.

AG


On Tue, Jan 22, 2013 at 6:04 PM, Duncan Temple Lang  wrote:

>
> Hi Adam
>
>  [You seem to have sent the same message twice to the mailing list.]
>
> There are various strategies/approaches to creating the data frame
> from the XML.
>
> Perhaps the approach that most closely follows your approach is
>
>   xmlRoot(doc)[ "row" ]
>
> which returns a list of the XML nodes named "row" that are
> children of the root node.
>
> So
>   sapply(xmlRoot(doc) [ "row" ], xmlAttrs)
>
> yields a matrix with as many columns as there are <row> nodes
> and with one row for each attribute (BRAND, NUM, YEAR and VALUE).
>
> So
>
>   d = t( sapply(xmlRoot(doc) [ "row" ], xmlAttrs) )
>
> gives you a matrix with the correct rows and column orientation
> and now you can turn that into a data frame, converting the
> columns into numbers, etc. as you want with regular R commands
> (i.e. independently of the XML).
>
>
>  D.
>
> On 1/22/13 1:43 PM, Adam Gabbert wrote:
> >  Hello,
> >
> > I'm attempting to read information from an XML into a data frame in R
> using
> > the "XML" package. I am unable to get the data into a data frame as I
> would
> > like.  I have some sample code below.
> >
> > *XML Code:*
> >
> > Header...
> >
> > Data I want in a data frame:
> >
> >
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >   
> >
> > *R Code:*
> >
> > doc <- xmlInternalTreeParse("Sample2.xml")
> > top <- xmlRoot (doc)
> > xmlName (top)
> > names (top)
> > art <- top [["row"]]
> > art
> > **
> > *Output:*
> >
> >> art
> >
> > * *
> >
> >
> > This is where I am having difficulties.  I am unable to "access"
> additional
> > rows; ( i.e.   )
> >
> > and I am unable to access the individual entries to actually create the
> > data frame.  The data frame I would like is as follows:
> >
> > BRAND   NUM  YEAR  VALUE
> > GMC     1    1999  1
> > FORD    2    2000  12000
> > GMC     1    2001  12500
> > etc
> >
> > Any help or suggestions would be appreciated.  Conversely, my eventual
> > goal would be to take a data frame and write it into an XML file in the
> > previously shown format.
> >
> > Thank you
> >
> > AG
> >

Re: [R] Create a Data Frame from an XML

2013-01-22 Thread Duncan Temple Lang

Hi Adam

 [You seem to have sent the same message twice to the mailing list.]

There are various strategies/approaches to creating the data frame
from the XML.

Perhaps the approach that most closely follows your approach is

  xmlRoot(doc)[ "row" ]

which returns a list of the XML nodes named "row" that are
children of the root node.

So
  sapply(xmlRoot(doc) [ "row" ], xmlAttrs)

yields a matrix with as many columns as there are <row> nodes
and with one row for each attribute (BRAND, NUM, YEAR and VALUE).

So

  d = t( sapply(xmlRoot(doc) [ "row" ], xmlAttrs) )

gives you a matrix with the correct rows and column orientation
and now you can turn that into a data frame, converting the
columns into numbers, etc. as you want with regular R commands
(i.e. independently of the XML).
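
Put together, the steps above might look like the following sketch. The inline
XML is a made-up stand-in for Sample2.xml (the root element name "data" is a
placeholder and the attribute values are copied from the table in the question),
so adjust it to the real file.

```r
library(XML)

# Hypothetical stand-in for Sample2.xml; the real root element and header
# are not shown in the question.
txt <- '<data>
  <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1"/>
  <row BRAND="FORD" NUM="2" YEAR="2000" VALUE="12000"/>
  <row BRAND="GMC" NUM="1" YEAR="2001" VALUE="12500"/>
</data>'
doc <- xmlParse(txt, asText = TRUE)

# One column per <row> node, transposed so that each node becomes a row.
d <- t(sapply(xmlRoot(doc)["row"], xmlAttrs))

# Every row is labelled "row"; drop those duplicate row names.
rownames(d) <- NULL

# Attributes arrive as character strings, so convert the numeric columns.
df <- as.data.frame(d, stringsAsFactors = FALSE)
num_cols <- c("NUM", "YEAR", "VALUE")
df[num_cols] <- lapply(df[num_cols], as.numeric)

str(df)
```

This relies on every <row> node carrying the same set of attributes; if some
are missing, sapply() returns a list rather than a matrix and the rows would
need to be aligned by name first.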


 D.

On 1/22/13 1:43 PM, Adam Gabbert wrote:
>  Hello,
> 
> I'm attempting to read information from an XML into a data frame in R using
> the "XML" package. I am unable to get the data into a data frame as I would
> like.  I have some sample code below.
> 
> *XML Code:*
> 
> Header...
> 
> Data I want in a data frame:
> 
>
>   
>   
>   
>   
>   
>   
>   
>   
>   
>   
>   
> 
> *R Code:*
> 
> doc <- xmlInternalTreeParse("Sample2.xml")
> top <- xmlRoot (doc)
> xmlName (top)
> names (top)
> art <- top [["row"]]
> art
> **
> *Output:*
> 
>> art
> 
> * *
> 
> 
> This is where I am having difficulties.  I am unable to "access" additional
> rows; ( i.e.   )
> 
> and I am unable to access the individual entries to actually create the
> data frame.  The data frame I would like is as follows:
> 
> BRAND   NUM  YEAR  VALUE
> GMC     1    1999  1
> FORD    2    2000  12000
> GMC     1    2001  12500
> etc
> 
> Any help or suggestions would be appreciated.  Conversely, my eventual goal
> would be to take a data frame and write it into an XML in the previously
> shown format.
> 
> Thank you
> 
> AG
> 

[R] Create a Data Frame from an XML

2013-01-22 Thread Adam Gabbert
 Hello,

I'm attempting to read information from an XML into a data frame in R using
the "XML" package. I am unable to get the data into a data frame as I would
like.  I have some sample code below.

*XML Code:*

Header...

Data I want in a data frame:

   
  
  
  
  
  
  
  
  
  
  
  

*R Code:*

library(XML)

doc <- xmlInternalTreeParse("Sample2.xml")
top <- xmlRoot(doc)
xmlName(top)
names(top)
art <- top[["row"]]  # first child named "row"
art
*Output:*

> art

* *


This is where I am having difficulties.  I am unable to "access" additional
rows; ( i.e.   )

and I am unable to access the individual entries to actually create the
data frame.  The data frame I would like is as follows:

BRAND   NUM  YEAR  VALUE
GMC     1    1999  1
FORD    2    2000  12000
GMC     1    2001  12500
etc

Any help or suggestions would be appreciated.  Conversely, my eventual goal
would be to take a data frame and write it into an XML in the previously
shown format.

Thank you

AG
