*Aim*

I am trying to define a generic data structure for storing tree structures such as XML documents in the new nested arrays. I'd like to use this structure to store XML documents, and to store the tree data structures I use for tree widgets. I'd like it to be simpler to use, understand and debug than the XML external.

NB - I remember a reference to some scripts that took XML and created the new nested arrays. Does anyone remember where it is? I couldn't find it.

This post is a work in progress - a brain dump. I hope it's useful. Writing things up a bit first helps me structure my coding, and hopefully helps others with similar issues.

*Key Words*

XML, tree widgets, nested data structures, design patterns, nested arrays, regular expressions

*The Problem*

I'm working on a script to change one XML document into another - it is the sort of thing XSLT <http://www.w3.org/TR/xslt> is used for. So what would be a good way (design pattern) to convert one XML tree into another in Revolution?

Ideally this structure would work for XML - but also for more general tree structures where the node names could be complete lines of text (and not simply single-word XML node names), such as you might have in an indented field.

There are all sorts of uses for this - let's take a recent example I've had to do - translating rev htmltext to xHtml basic <http://www.w3.org/TR/xhtml-basic/>: which essentially involves taking elements like "

As an example of the sort of hacks I'd like to avoid - here is a function that I came up with for the xHtml basic <http://www.w3.org/TR/xhtml-basic/> use case:

    function html_RevToBoldSpan someHtml
        put "(?miU)(<b>).*(</b>)" into someReg
        -- put "(?mi)(<b>)[^\<]*(</b>)" into someReg
        repeat
            if matchchunk(someHtml, someReg, oTagStart, oTagEnd, cTagStart, cTagEnd) is true then
                put "</span>" into char cTagStart to cTagEnd of someHtml
                put "<span style='font-weight:bold'>" into char oTagStart to oTagEnd of someHtml
            else
                return someHtml
            end if
        end repeat
    end html_RevToBoldSpan

Tip - for those of you that have delved into regular expressions - this script illustrates a new trick I've found: the use of "U" to force non-greedy matching. (?miU) at the beginning of a regexp causes the match to be multi-line (m), case insensitive (i) and non-greedy (U).

The problem with scripts like this is that they cannot deal with truly nested formatting tags - for that you need to walk the tag tree - which is what I'd like to do next.

Here were my initial thoughts on a simple start based on renaming XML nodes:

1. Create an XML Tree for the original XML
2. Create a new one for the transformed XML
3. Write a recursive function to walk the tree - starting at the root node, getting its children and recursing
4. Have the recursive function make a call to a translate function which uses an array to store the new tag names as the contents of nested keys - this could be an array or use the xml treeID

Somehow I need to include the general ability to use node attributes to determine the new node name - so that:

    <span style='font-weight:bold'> => <b>
    <span style='color:#FF0000'>Red</span> => <font color="#FF0000">

I think what I'd really like is to do all this with arrays rather than XML treeIDs - that is:

1. Create an XML Tree for the original XML and convert it to an array
2. Write a recursive function to walk the array - starting at the root node, getting its children and recursing
3. Have the recursive function make a call to a translate function which uses an array to store the new tag names as the contents of nested keys - building a new transformed array as it goes
4. Create a new XML document from the transformed array

What I need to decide is:

- what sort of structure to use for this generic array, and
- not so critically, how to implement some sort of "plugin" to this design pattern so that a variety of transformations are easy and intuitive to do.

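To make the walk-and-translate steps above a bit more concrete, here is a rough sketch. It is written in Python rather than Revolution script, and it uses the standard library's xml.etree.ElementTree in place of the XML external, so treat it as an illustration of the pattern only. The names translate and transform are made up for the example, and the rules cover just the two span cases mentioned above.

```python
import xml.etree.ElementTree as ET


def translate(tag, attrs):
    """Return (new_tag, new_attrs) for one node.

    Hypothetical rules: the style attribute decides the new node name.
    Nodes that match no rule are passed through unchanged.
    """
    style = attrs.get("style", "")
    if tag == "span" and style == "font-weight:bold":
        return "b", {}
    if tag == "span" and style.startswith("color:"):
        return "font", {"color": style[len("color:"):]}
    return tag, dict(attrs)


def transform(node):
    """Walk the tree recursively, building a new transformed tree as we go."""
    new_tag, new_attrs = translate(node.tag, node.attrib)
    new_node = ET.Element(new_tag, new_attrs)
    new_node.text = node.text   # text before the first child
    new_node.tail = node.tail   # text after the closing tag
    for child in node:
        new_node.append(transform(child))
    return new_node


# Nested formatting tags, which the regexp approach cannot handle.
source = ("<p><span style='font-weight:bold'>Bold "
          "<span style='color:#FF0000'>Red</span></span></p>")
result = ET.tostring(transform(ET.fromstring(source)), encoding="unicode")
print(result)
```

Because each node is translated on its own and the children are handled by recursion, nesting depth does not matter. Keeping the rules in a single translate function is one way to get the "plugin" behaviour: a different transformation only needs a different translate.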
*Nested Array Data Structure*

For the array structure I want to use the new nested arrays, and also to be able to store attributes of nodes - something like:

    node_1
        node_1_1
            node_1_1_1
        node_1_2

Putting an XML tree like the one above into a nested array, we could then do things like:

- put treeArray ["node_1"]["node_1_2"] into nodeContents
- put treeArray ["node_1"]["node_1_2"]["attribute"]["style"] into nodeStyle

With the attributes it gets ugly: it would require filtering out the "attribute" key, and naming it in as unique a way as possible. So probably better to store a separate attribute branch of the array?

- put treeArray ["_tree"]["node_1"]["node_1_2"] into nodeContents
- put treeArray ["_attribute"]["node_1"]["node_1_2"]["style"] into nodeStyle

*What about duplicate nodes?*

This is where I get a bit stuck, as there are still problems: for instance, at the moment the structure does not allow for duplicate nodes (which are very common):

    topNode
        duplicateNode
            nestedNode
        duplicateNode
            anotherNestedNode

The xml treeID notation uses references like:

    topnode/duplicateNode[2]/anotherNestedNode

I have no idea what to do about that. Something like:

- put treeArray ["tree"]["topNode"]["duplicateNode/2"]["anotherNestedNode"] into nodeContents ???
- put treeArray ["tree"]["topNode"][1]["duplicateNode"][2]["anotherNestedNode"][1] into nodeContents

*What about callout functions?*

I've used these before for plugin searches - but it may be more intuitive to simply copy and customise a handler for a specific purpose. On the other hand, recursive functions are never that intuitive to customise - so callouts may be better? And I'm not sure what a callout function could look like?

    on custom_TransformNode nodePath, originalArray, @transFormed
    end custom_TransformNode
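
As one possible answer to both the duplicate-node question and the callout question, here is a rough sketch, again in Python rather than Revolution script. It assumes each node is stored as a small record with a name, an attribute branch and an ordered list of children, so duplicate names are allowed and the attributes never collide with child names. All key names and function names (make_node, find, walk, rename_duplicates) are invented for the example. The callout here returns the new node instead of filling in a by-reference parameter as in the stub above.

```python
def make_node(name, attributes=None, children=None):
    # Children are kept in an ordered list, so duplicate names are allowed.
    return {"name": name,
            "attributes": dict(attributes or {}),
            "children": list(children or [])}


def parse_step(step):
    """Split 'duplicateNode[2]' into ('duplicateNode', 2); no index means 1."""
    if step.endswith("]"):
        name, _, index = step[:-1].partition("[")
        return name, int(index)
    return step, 1


def find(node, path):
    """Resolve a path like 'topNode/duplicateNode[2]/anotherNestedNode'.

    The index counts same-named siblings from 1.
    Returns None when the path does not match.
    """
    steps = path.split("/")
    if node["name"] != parse_step(steps[0])[0]:
        return None
    for step in steps[1:]:
        name, index = parse_step(step)
        matches = [c for c in node["children"] if c["name"] == name]
        if not 1 <= index <= len(matches):
            return None
        node = matches[index - 1]
    return node


def walk(node, callout, path=None):
    """Recurse over the tree; the callout decides what each node becomes."""
    path = path or node["name"]
    new_node = callout(path, node)
    counts = {}
    for child in node["children"]:
        counts[child["name"]] = counts.get(child["name"], 0) + 1
        child_path = "%s/%s[%d]" % (path, child["name"], counts[child["name"]])
        new_node["children"].append(walk(child, callout, child_path))
    return new_node


def rename_duplicates(path, node):
    # Example callout: rename one tag, copy everything else unchanged.
    name = "item" if node["name"] == "duplicateNode" else node["name"]
    return make_node(name, node["attributes"])


tree = make_node("topNode", children=[
    make_node("duplicateNode", children=[make_node("nestedNode")]),
    make_node("duplicateNode", {"style": "bold"},
              [make_node("anotherNestedNode")]),
])

found = find(tree, "topNode/duplicateNode[2]/anotherNestedNode")
new_tree = walk(tree, rename_duplicates)
```

With this layout the recursion lives in walk and never needs to be customised; a new transformation is just a new callout. The cost is that lookups by name have to scan the child list instead of using a key directly.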