Hi!
I have a program that parses text files with the following format:
(...)
ADCHN 1
AG1 42.06 deg
AG2 90.00 deg
AG3 90.00 deg
CLFRQ 100.0 Hz
CLMOD 0
CLPNT 256
CLRSO 0.4 Hz
(...)
[The complete file is at http://www.mestrec.com/good.txt]
However, I have found cases where the text file is somewhat corrupted. For
instance, the text file becomes something like this:
(...) ��AG1 22.50 deg��AG2 0.00 deg��AG3 0.00 deg��CLFRQ
5000.0 Hz ��CLMOD 0 ��CLPNT 128 ��CLRSO 39.1 Hz (...)
[The complete file is at http://www.mestrec.com/bad.txt. Note: the file must
be opened with WordPad, not Notepad.]
In this case, the text file was transferred from the PDP-11 using Kermit in
binary mode.
Any idea how this format can be parsed?
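One possible approach, assuming the stray bytes are simply non-printable separators sitting where the line breaks used to be (their actual values have not been confirmed), would be to read the whole file in binary mode and turn every such byte into '\n' before tokenizing. A minimal sketch; NormalizeSeparators is a hypothetical helper name, not something from the existing program:

```cpp
#include <string>

// Hypothetical helper (assumption, not part of the original program):
// every byte that is not printable ASCII, a tab, or a newline is
// replaced by '\n', so stray separator bytes become line breaks.
std::string NormalizeSeparators(const std::string &raw)
{
    std::string out;
    out.reserve(raw.size());
    for (std::string::size_type i = 0; i < raw.size(); ++i)
    {
        const unsigned char c = static_cast<unsigned char>(raw[i]);
        const bool keep = (c >= 0x20 && c < 0x7F) || c == '\t' || c == '\n';
        out += keep ? static_cast<char>(c) : '\n';
    }
    return out;
}
```

Two stray bytes in a row produce an empty line, which a tokenizer that skips empty lines would simply ignore.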
Currently, I'm using the following function to look up a given label in the
text file:
CString ReadParam(FILE *fp, CString &str)
{
    rewind(fp);
    char sz[2048];
    // Read up to a full buffer per line.
    while (fgets(sz, sizeof(sz), fp) != NULL)
    {
        StringTokenizer token(sz);
        // Need both a label and a value before indexing token[1].
        if (token.size() >= 2)
        {
            CString s(token[0].c_str());
            if (str.CompareNoCase(s) == 0)
            {
                return token[1].c_str();
            }
        }
    }
    return "";
}
where StringTokenizer is defined as follows:
#ifndef TOKENIZER_H
#define TOKENIZER_H
#include <string>
#include <vector>
using namespace std;
class StringTokenizer : public vector<string>
{
public:
    StringTokenizer(const string &rStr, const string &rDelimiters = " ,\n=");
};
inline StringTokenizer::StringTokenizer(const string &rStr,
                                        const string &rDelimiters)
{
    string::size_type lastPos(rStr.find_first_not_of(rDelimiters, 0));
    string::size_type pos(rStr.find_first_of(rDelimiters, lastPos));
    while (string::npos != pos || string::npos != lastPos)
    {
        push_back(rStr.substr(lastPos, pos - lastPos));
        lastPos = rStr.find_first_not_of(rDelimiters, pos);
        pos = rStr.find_first_of(rDelimiters, lastPos);
    }
}
#endif
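Because ReadParam reads through a fixed-size fgets buffer, a file with no real newlines looks like one very long line. As a rough sketch only, the same label lookup can be done over text already held in a std::string, using nothing but the standard library. FindParam and EqualsNoCase are hypothetical names, and unlike StringTokenizer this version splits on whitespace only (not on ',' or '='):

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Case-insensitive comparison of two ASCII strings.
static bool EqualsNoCase(const std::string &a, const std::string &b)
{
    if (a.size() != b.size())
        return false;
    for (std::string::size_type i = 0; i < a.size(); ++i)
    {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}

// Hypothetical stand-in for ReadParam (assumption, not the original code):
// scans the text line by line and returns the second whitespace-separated
// token of the first line whose first token matches label, or "" if none.
std::string FindParam(const std::string &text, const std::string &label)
{
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line))
    {
        std::istringstream fields(line);
        std::string name, value;
        // Lines with fewer than two tokens are skipped.
        if ((fields >> name >> value) && EqualsNoCase(name, label))
            return value;
    }
    return "";
}
```

With the sample lines from the good file, FindParam(text, "ag1") would return "42.06", and a label that is absent or has no value yields an empty string.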
