I don't completely follow your description. I defined a handler that is a
subclass of XercesNamespace::HandlerBase and override these methods:
    virtual void startElement
    (
        const XMLCh* const name
        , AttributeList& attributes
    ) {
        // use name as the element name.
        myStack.push_back(... new stack item with name...);
        // traverse attributes if you need to
    }
    virtual void characters
    (
        const XMLCh* const chars
        , const XMLSize_t length
    ) {
        // append exactly 'length' characters; the buffer must not be
        // read beyond that range. May be called several times per element.
        myStack.back().elementText.append(chars, chars + length);
    }
    virtual void endElement(const XMLCh* const name) {
        // check that myStack.back().element == name
        // enter { myStack.back().element, myStack.back().elementText } into map
        myStack.pop_back();
    }
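To make the timing concrete, here is a minimal sketch that replays by hand the callbacks a SAX parser would make for one <course> record. It does not use Xerces at all: std::string stands in for XMLCh, and every name in it (StackItem, trim, TraceHandler, replayCourse) is hypothetical, chosen only for illustration. It assumes you want surrounding whitespace stripped, since characters() also reports the whitespace between child elements.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the stack item described above.
struct StackItem {
    std::string element;
    std::string elementText;
};

// Strip leading and trailing whitespace from the collected text.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

class TraceHandler {
public:
    // Element opened: push an item holding the name and empty text.
    void startElement(const std::string& name) {
        stack_.push_back({name, ""});
    }
    // Text parsed: append to the innermost open element.
    void characters(const std::string& chars) {
        if (!stack_.empty()) stack_.back().elementText += chars;
    }
    // Element closed: only now is its text complete, so record it and pop.
    void endElement(const std::string& name) {
        assert(!stack_.empty() && stack_.back().element == name);
        finished_.push_back({name, trim(stack_.back().elementText)});
        stack_.pop_back();
    }
    const std::vector<StackItem>& finished() const { return finished_; }

private:
    std::vector<StackItem> stack_;     // currently open elements
    std::vector<StackItem> finished_;  // elements in the order they closed
};

// Replays the events for:
// <course><reg_num> 10778th </reg_num><subj> BIOL </subj></course>
std::vector<StackItem> replayCourse() {
    TraceHandler h;
    h.startElement("course");
    h.characters("\n  ");
    h.startElement("reg_num");
    h.characters(" 10778th ");
    h.endElement("reg_num");
    h.characters("\n  ");
    h.startElement("subj");
    h.characters(" BIOL ");
    h.endElement("subj");
    h.characters("\n");
    h.endElement("course");
    return h.finished();
}
```

Calling replayCourse() yields the items in closing order: reg_num, then subj, then course. The text of an element is only known to be complete when its endElement() arrives, which is why the map entry is made there rather than in startElement().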
john
-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Monday, March 15, 2010 8:13 AM
To: [email protected]
Subject: RE: Help with sax
Importance: High
Hello,
I'm following these steps: I added the struct to SAXPrinterHandlers.hpp,
and in the startElement method in SAXPrinterHandlers.cpp I create a
PathData item and store the name of the element in it. If I understand
correctly, elementText should hold the contents of the element. For
example, with a file like this:
<root>
  <course name="prova">
    <reg_num> 10778th </reg_num>
    <subj> BIOL </subj>
  </course>
  <course name="prova1">
    <reg_num> 10779th </reg_num>
    <subj> BIOS </subj>
  </course>
</root>
I have to create the following sequence of PathData elements:
1. element = root and elementText = null
2. element = course and elementText = null
3. element = reg_num and elementText = 10779th and so on ...
and then place them in the vector. Did I understand what you wanted to
tell me, or did I misunderstand something? Of course, I fill the element
field of the struct in the startElement method, but do I fill the
elementText field in the characters method? At this point I still wonder:
the characters method is used to read the contents of the elements, isn't
it? At what point should I fill in the elementText field?
I have additional questions for you, but I think that is enough for today.
One step at a time we will get to the solution. Thanks ... ciao
----Original Message----
From: [email protected]
Date: 9 Mar 2010 2:15 PM
To: "[email protected]"<[email protected]>,
"[email protected]"<[email protected]>
Subject: RE: Help with sax
Alessandra,
It sounds like you are a beginner to Xerces. In that case, I suggest that you
first study the samples, such as SAXPrint.
You'll need a SAXParser object and a DocumentHandler subclass, where you
implement at least startElement(), endElement(), and characters() methods.
The SAX model will call your virtual methods when elements are opened and
closed, and when element text is parsed. To do what you want (path-to-text
map), you should maintain a stack of element paths combined with current
element text in your parser subclass, something like:
    struct PathData {
        std::wstring element;
        std::wstring elementText;
    };
    std::vector<PathData> pathdata;
In the startElement() method, push an item onto the stack. In the
characters() method, append to pathdata.back().elementText. In the
endElement() method, construct the current path by concatenating the element
values of the whole stack, use that as the key to insert into your map along
with pathdata.back().elementText, then pop the stack.
That's kind of a rough construct. You'll also need to deal with the attribute
list of each element, if you want that, by traversing the attribute list when
you get the startElement() call. There are many details to work through beyond
this.
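As a rough illustration of that construct, here is a minimal sketch of the path-to-text map. To keep it compilable without Xerces it takes std::string instead of XMLCh and is driven by hand rather than by a parser. All names in it (PathEntry, PathMap, PathCollector, stripSpaces, collectStudents) are hypothetical, and it assumes whitespace-only text should be ignored and that each path maps to a list of values.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical names throughout; std::string stands in for XMLCh text.
struct PathEntry {
    std::string element;
    std::string elementText;
};

using PathMap = std::map<std::string, std::vector<std::string>>;

// Remove leading and trailing whitespace.
std::string stripSpaces(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    return s.substr(first, s.find_last_not_of(ws) - first + 1);
}

class PathCollector {
public:
    void startElement(const std::string& name) {
        stack_.push_back({name, ""});
    }
    void characters(const std::string& chars) {
        if (!stack_.empty()) stack_.back().elementText += chars;
    }
    void endElement() {
        assert(!stack_.empty());
        const std::string text = stripSpaces(stack_.back().elementText);
        // Build the key while the closing element is still on the stack.
        if (!text.empty()) map_[currentPath()].push_back(text);
        stack_.pop_back();
    }
    const PathMap& result() const { return map_; }

private:
    // Concatenate the element names of the whole stack: root/student/name
    std::string currentPath() const {
        std::string path;
        for (const PathEntry& item : stack_) {
            if (!path.empty()) path += '/';
            path += item.element;
        }
        return path;
    }

    std::vector<PathEntry> stack_;
    PathMap map_;
};

// Replays the events for two <student><name>..</name></student> records.
PathMap collectStudents() {
    PathCollector c;
    c.startElement("root");
    const char* names[] = {" luca ", " anna "};
    for (const char* n : names) {
        c.characters("\n");
        c.startElement("student");
        c.startElement("name");
        c.characters(n);
        c.endElement();  // name
        c.endElement();  // student
    }
    c.characters("\n");
    c.endElement();  // root
    return c.result();
}
```

With the student file from the original question, collectStudents() produces a single key, root/student/name, holding the two names in document order. Memory use stays proportional to the nesting depth plus the collected values, which is the reason for using SAX on large files.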
Hope that helps,
john lilley
-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Monday, March 08, 2010 5:50 PM
To: [email protected]
Subject: Help with sax
Hello, I am asking your mailing list for help because I need some more
detailed information about Xerces-C++. The problem is this: the objective
of my computer science degree thesis is to build an algorithm for cleaning
XML data. The first step is to build an algorithm that can read an XML
file and generate, from the various XML elements, data containers for
those items, classified by the tag they belong to. Let me explain better.
Suppose you have a file like this:
<root>
  <student>
    <name> luca </name>
  </student>
  <student>
    <name> anna </name>
  </student>
</root>
My intention is to create a map whose key is the path of an element, for
example root/student/name, and whose value is a list containing the names
of all students.
I also need to use SAX for parsing because the files are large. I would
like to know whether there are specific methods, for example to get the
path of the various elements, and whether you have any ideas on what to
do if there is no such method. Regards, Alessandra. I hope that is clear ...