Good stuff, Peter. I guess for some of the projects we've worked on, running the character frequency analysis on the OSIS files means doing it at the very last stage of the process, just before the module build.
For projects that begin at USFM (or earlier), it would be great to develop a tool that analyses the character frequency of the text (for the whole Bible) apart from all the USFM tags, etc.

One simple way to do this would be a script that does the following:
(a) merges all the USFM files into a single text file
(b) removes all the USFM tags (and the English material such as IDs, text in remarks, etc.)
(c) counts the character frequencies

For my part, (a) and (b) could easily be done by means of a TextPipe filter.

David

--
View this message in context: http://sword-dev.350566.n4.nabble.com/Character-Frequency-tp3642222p3653469.html
Sent from the SWORD Dev mailing list archive at Nabble.com.
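
P.S. Steps (a) to (c) could be sketched in Python roughly as follows. This is only a minimal sketch under assumptions, not a finished tool: the function names, the "*.usfm" file pattern, the UTF-8 encoding, and the list of markers treated as English-only metadata (\id, \ide, \rem, \sts, \usfm) are all my own guesses and would need adjusting per project. It also leaves footnote, heading and attribute text in place, which a real filter might want to drop.

```python
import re
from collections import Counter
from pathlib import Path

# Lines that carry metadata rather than Bible text (assumed list, adjust as needed).
SKIP_LINE = re.compile(r"^\\(id|ide|rem|sts|usfm)\b.*$", re.MULTILINE)
# Chapter and verse markers together with their numbers, e.g. "\c 1" or "\v 12-13".
CHAPTER_VERSE = re.compile(r"\\[cv]\s+\S+")
# Any other marker, including nested (\+wj) and closing (\wj*) forms.
MARKER = re.compile(r"\\\+?[A-Za-z]+\d*\*?")


def strip_usfm(text):
    """Step (b): remove metadata lines, chapter/verse numbers and markers."""
    text = SKIP_LINE.sub("", text)
    text = CHAPTER_VERSE.sub(" ", text)
    return MARKER.sub(" ", text)


def char_frequency(text):
    """Step (c): count every non-whitespace character."""
    return Counter(ch for ch in text if not ch.isspace())


def analyse_folder(folder, pattern="*.usfm", encoding="utf-8"):
    """Step (a) plus (b) and (c): merge all matching files, strip, count."""
    files = sorted(Path(folder).glob(pattern))
    merged = "\n".join(p.read_text(encoding=encoding) for p in files)
    return char_frequency(strip_usfm(merged))


def format_report(counts):
    """Return one tab-separated line per character, most frequent first."""
    return [f"U+{ord(ch):04X}\t{ch}\t{n}" for ch, n in counts.most_common()]
```

Calling print("\n".join(format_report(analyse_folder("path/to/usfm")))) would then list each code point with its count, so unexpected characters stand out before the text ever reaches OSIS.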