My longest-running in-house production app is an audio transcriber. It is a very 
successful little gadget, running in xTalk since 2001.

We have over 1,000 XML files from an audio archive of transcripts.

Now I'm digging in and getting the data out.

I'm not facile with the XML routines but did my best with the help of Bernd's new, 
actually usable, dictionary.

But I ran into a bug in 9 DP5 (I think…), or I am doing something wrong.

Given transcripts formatted with nodes like this:

<?xml version="1.0" encoding="UTF-8"?>
<audio_transcript>
<header>
  <audio_filename>CAS0886_radio-pilot_Inspired-Talks.A.mp3</audio_filename>
  <date_given>1980-01-03</date_given>
  <given_by>Gurudeva</given_by>
  <subject>Three Words of Existence</subject>
  <category>God and Lords of Dharma</category>
  <duration>18 min, 36 secs</duration>
  <given_location>San Francisco</given_location>
  <transcribed_by>Brahmanathaswami</transcribed_by>
  <description>
          Subtopic: three worlds: 0:3:56
          Subtopic: temple: 0:4:7
          </description>
</header>
          <transcript_text>
                   <p>
                             [Radio Announcer: Ravi Peruman introduces Gurudeva]
                   </p>
                   <p>
                             Gurudeva says ......
                   </p>
                   <p>
                             More content here
                   </p>
                   <p>
                             Subtopic: three worlds: 0:3:56
                   </p>
                   <p>
                             All about temple
                   </p>
                   <p>
                             Subtopic: temple: 0:4:7
                   </p>
          </transcript_text>
</audio_transcript>

My script looks like this:

put revXMLChildContents(pTree, "/audio_transcript/header",tab,return,false,4) into fld "productionNotes"  # this works… I get all the contents
put revXMLNodeContents(pTree,"/audio_transcript/transcript_text/p") into tText # this works but we only get the first <p> content

# so I presume (like I said… parsing XML is new to me) we need to loop/iterate over the sibling <p> tags
put revXMLNumberOfChildren(pTree,"/audio_transcript/transcript_text/","p",4) # returns 6

# the following line should provide us what we need, I think, to set up a repeat loop using the indexed node function
# and this is a) according to the dictionary and b) the script will compile:

put revXMLChildNames(pTree,"/audio_transcript/transcript_text/", return,"p",true)

I get a "green" OK in the script editor, and when I run it we get this output, 
which is expected:

p[1]
p[2]
p[3]
p[4]
p[5]
p[6]

Presumably I can now use that list to fetch the contents of all those nodes 
(haven't figured that out yet).

But the engine fires an error message when we run it, even though the script 
compiled without complaining:

button "Load Transcript": execution error at line 22 (Handler: can't find handler) near "", char 89

It is breaking at the end of this line:

put revXMLChildNames(pTree,"/audio_transcript/transcript_text/", return,"p",true)


Even though the script compiles… isn't this a bug? If the line a) is what the 
dictionary says it should be and b) compiles, why the error?

if not, what am I doing wrong?
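
For reference, the loop I am trying to get to would look roughly like the sketch below. It is untested, and it rests on two assumptions I have not confirmed: that revXMLNodeContents accepts an indexed path such as /audio_transcript/transcript_text/p[2], and that the start node can be written without the trailing slash. tChildren and tChild are just names made up for this example.

```livecode
-- Untested sketch, not a confirmed fix.
-- Assumption: revXMLChildNames returns one name per line (p[1] … p[6])
put revXMLChildNames(pTree, "/audio_transcript/transcript_text", return, "p", true) into tChildren
put empty into tText
repeat for each line tChild in tChildren
   -- Assumption: an indexed name such as p[2] is valid as the last path segment
   put revXMLNodeContents(pTree, "/audio_transcript/transcript_text/" & tChild) & return after tText
end repeat
```

If that is the right idea, tText would end up holding the contents of every <p> node, one per line, instead of just the first.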

The full button script is below… and you can see my "fumbling" to fetch the content 
of all the "p" nodes. There seems to be some oddity relating to multiple nodes 
all having the same tag.

global theTape
on mouseUp
   put theTape into tTranscript
   set the itemdel to "."
   put "xml" into item -1 of tTranscript
   if there is a file tTranscript then
      put url ("file:/" & tTranscript) into tTranscriptXML
   else
      answer "Sorry, there is no transcript in the same folder as the audio" with "OK"
      exit to top
   end if
   put revXMLCreateTree(tTranscriptXML,false, true,true) into pTree
   if pTree is not an integer then
      answer "Problem with the XML. Open in a text editor" with "OK"
   end if
   put revXMLChildContents(pTree, "/audio_transcript/header",tab,return,false,4) into fld "productionNotes"
   put revXMLNodeContents(pTree,"/audio_transcript/transcript_text/p") into tText
   put revXMLNumberOfChildren(pTree,"/audio_transcript/transcript_text/","p",4)
   put revXMLChildNames(pTree,"/audio_transcript/transcript_text/", return,"p",true)

   # this script compiles, but breaks on the above line when run
   --put revXMLNextSibling(pTree,"/audio_transcript/transcript_text/p") into nextSibling
   --put revXMLNodeContents(pTree,nextSibling) after tText # feeble attempt fails, need to do some loop but don't know how.
   # no robust examples to follow, any help appreciated!
   --put revXMLNodeContents(pTree, "audio_transcript/header/duration") into tTranscriptHTML # works for single node (of course)
   --set the htmltext of fld "transcript" of stack "Audio_transcriber" to tTranscriptHTML

end mouseUp
