Re: [wsjt-devel] ALL.TXT (again)
On 7/9/19 12:46 PM, Greg Beam wrote:

Hi Greg,

> Personally, I don't have a need for it either. If all I am after is QSO validation: grep + awk or a good quality text editor is all that's needed :-)

Exactly my opinion!

> However, if one wants to do any sort of data analysis, the flat file format is less than ideal. Normalizing the data would go a long way toward shrinking the footprint, however, it also adds a level of complexity some may not find very pleasant.

It depends on what we want to analyse. As an example, if we want to analyse the dT, a small parsing and extraction program followed by a statistics program like R is sufficient. Having the data stored in a database would give us no advantage, as long as the database management system is not itself coupled to such statistics software.

> I am aware, and have done a bit of parsing in the past regarding the varying data structures of the file. It's changed a good bit over the last few versions.

Exactly! I think that a small program in Perl is sufficient to extract the needed information. Of course, other languages are possible too.

Best wishes,
Claude (DJ0OT)

___
wsjt-devel mailing list
wsjt-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/wsjt-devel
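The dT extraction Claude describes could look roughly like the sketch below. It is written in Python rather than Perl or R, and it assumes the current line layout seen elsewhere in this thread (timestamp, frequency, Rx/Tx, mode, SNR, dT, audio frequency, message). The function name `extract_dt` and the sample lines are made up for illustration.

```python
import re
import statistics

# Current-format lines start with YYMMDD_HHMMSS (regex suggested in this thread).
STAMP = re.compile(r"^\d{6}_\d{6}\b")

def extract_dt(lines):
    """Collect the dT value (seconds) from every received decode line.

    Assumed field order: stamp freq Rx/Tx mode snr dT audio_freq message...
    Lines in any other layout are skipped.
    """
    values = []
    for line in lines:
        if not STAMP.match(line):
            continue
        fields = line.split()
        if len(fields) >= 7 and fields[2] == "Rx":
            try:
                values.append(float(fields[5]))
            except ValueError:
                pass  # not a numeric dT, ignore the line
    return values

# Made-up sample lines; the last one is an old-format header and is ignored.
sample = [
    "190228_235945 14.074 Rx FT8 -3 0.9 2255 WB4HMA HC2AO -11",
    "190301_000015 14.074 Rx FT8 -13 0.5 2598 VE1DBM LU1JAO -08",
    "2017-01-01 00:00 7.076 MHz JT9+JT65",
]
dts = extract_dt(sample)
print(len(dts), "values, mean dT =", statistics.mean(dts))
```

The resulting list can be written out one value per line and loaded into R or any other statistics tool.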
Re: [wsjt-devel] ALL.TXT (again)
Hi Claude,

Personally, I don't have a need for it either. If all I am after is QSO validation: grep + awk or a good quality text editor is all that's needed :-)

However, if one wants to do any sort of data analysis, the flat file format is less than ideal. Normalizing the data would go a long way toward shrinking the footprint; however, it also adds a level of complexity some may not find very pleasant.

I am aware of the varying data structures of the file, and have done a bit of parsing on them in the past. The format has changed a good bit over the last few versions. In any case, it's a doable thing if one wants to store their history / data (for whatever reason). IoT devices often use these variable-structure data sets with great success. The variance in the ALL.TXT file is minimal compared to some I've seen.

73's
Greg, KI7MT
Re: [wsjt-devel] ALL.TXT (again)
On 7/9/19 10:58 AM, Greg Beam wrote:

Hi Greg,

> I think parsing the lines into fields is the best long-term solution for storage (allows for much better indexing).

This only makes sense if you can assign a clear definition to every field. In the case of ALL.TXT, not all lines have the same structure. In particular, the date and the time can be on different lines, depending on the WSJT-X release used.

> However, we'd need to do a fair bit of checking on each line to determine its structure first;

This is essential.

To be honest, I cannot see the advantage of having the data stored in ALL.TXT in a database. I use ALL.TXT only to check strange situations, or to check a case where I get a QSO confirmation for a QSO that is not in the log.

A database is very useful when we need frequent access to the data and when we need rather complex queries. These requirements do not apply to ALL.TXT. Remember, the database management software needs disk space and processing time.

Best wishes,
Claude (DJ0OT)
Re: [wsjt-devel] ALL.TXT (again)
Hi Mike,

I finally had some time to tinker with this some.

On my Win laptop using WSL, MSYS2, or MobaXterm (Cygwin port), import performance is not impressive by any means. I suspect file system interoperability is the root cause. Searches, once imported, are fine.

On my Linux workstation imports are extremely fast; note: I'm running MongoDB on an XFS partition with an abundance of RAM (MongoDB is a RAM hog).

I tinkered around with (3) scripts:

- Import Script (atimport.sh)
  Link: http://paste.ubuntu.com/p/bh5vGB5GPW/
- Call Sign Lookup (atcheck.sh)
  Link: http://paste.ubuntu.com/p/NHv9tztx2Z/
- Parse line into fields (retest.py)
  Link: https://paste.ubuntu.com/p/8HdPcTVXpN/

I think parsing the lines into fields is the best long-term solution for storage (allows for much better indexing). However, we'd need to do a fair bit of checking on each line to determine its structure first; check == "slower imports". Once the main (first) bulk import is done, incremental updates would be snappy.

I'm sure there are far better solutions; like a UDP service that just pushes the decodes straight into a database (of any kind) :-)

73's
Greg, KI7MT

On 7/1/19 3:05 PM, Black Michael via wsjt-devel wrote:
> Please do post the code. Thanks Greg.
> de Mike W9MDB
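As an illustration of the "parse line into fields" idea (this is not the retest.py script linked above, whose contents are not reproduced in this thread), a minimal Python sketch follows. The field names in `Decode` and the field order are assumptions based on the current-format sample lines quoted elsewhere in the thread.

```python
from typing import NamedTuple, Optional

class Decode(NamedTuple):
    """One current-format ALL.TXT line split into typed fields (names assumed)."""
    stamp: str        # YYMMDD_HHMMSS
    freq_mhz: float   # dial frequency
    direction: str    # "Rx" or "Tx"
    mode: str         # e.g. "FT8"
    snr_db: int
    dt_s: float
    audio_hz: int
    message: str      # decoded text, may contain spaces

def parse_line(line: str) -> Optional[Decode]:
    """Return a Decode, or None when the line has a different structure."""
    parts = line.split(None, 7)   # at most 8 parts; the message stays intact
    if len(parts) < 8:
        return None
    try:
        return Decode(parts[0], float(parts[1]), parts[2], parts[3],
                      int(parts[4]), float(parts[5]), int(parts[6]),
                      parts[7].strip())
    except ValueError:
        return None               # a numeric field did not parse

rec = parse_line("190228_235945 14.074 Rx FT8 -3 0.9 2255 WB4HMA HC2AO -11")
print(rec)
```

Returning None for anything that does not fit is the "checking on each line" step: the importer can then fall back to storing the raw string for those lines.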
Re: [wsjt-devel] ALL.TXT (again)
Hi Dan,

The examples I spoke of aren't running through a JS engine; they are being run through Mongo itself to perform the queries. If you want to run native JS server side, you'd be better off installing Node.js or similar. There are a ton of PHP/MongoDB tutorials out there. The main difference being, you'd most likely be using MongoClient through an ODM.

There's nothing to say one could not have multiple collections: one with the full string, such as the example I did, and another that has each field broken out by field type. Breaking out the fields has a number of advantages, particularly for analytics.

I'll tidy up the Bash script I use, and post it as a Gist. You could run it through MSYS2, Cygwin, or even WSL, as all should have access to Windows local servers. It would take a few changes for path variables, but nothing major.

I have Apache running on my server as it hosts some of my testing environment. I may play around with a PHP form or two and see how that goes.

73's
Greg, KI7MT
Re: [wsjt-devel] ALL.TXT (again)
Because I want to search via my webserver, I have a separate PHP script that does the search. Probably not as fast as MongoDB. I do get good cohesive reports though. For example, I can get a report that shows me just one of my QSOs, and it stores the results in a text file. That makes it useful to refer to later, or to include in an email to my QSO partner.

All that said, I would like to explore MongoDB. The idea of querying via a JS script may mean that I can still have private web access. I use IIS on Win10 and just the local machine can access it.

__
Dan – K4SHQ
Re: [wsjt-devel] ALL.TXT (again)
Please do post the code. Thanks Greg.

de Mike W9MDB
Re: [wsjt-devel] ALL.TXT (again)
Hello All,

Here's an example from today's log:

Results: https://paste.ubuntu.com/p/382VVMth4S/

The query takes about (2) seconds or so using a $regex search on 7,390,224 logged events matching two callsigns; this is without indexing or field splitting. Each line is imported as one string into a one-field document in MongoDB.

I can post the script I used as a gist, if anyone is interested:

- Copies the ALL.TXT to $temp_file
- Converts it to CSV
- Generates two helper JS scripts
- Generates one example JS query
- Drops, then re-imports the alltext collection
- Runs a sample query

Note: for incremental (daily) updates, there is no need to drop the collection (alltext) before inserting new events. I just do that for performance testing.

It's a simple one-line query command that would work on Win/Linux/Mac:

    mongo < example.js

You can, if desired, write any number of commands to perform stats, lookup(s), whatever, and use the same easy method to query the DB. However, as this is a single string entry, much of the analytical value would be missing, as the fields / categories are not captured.

73's
Greg, KI7MT
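The "converts it to CSV" step in the list above might be sketched as follows. This is not Greg's Bash script (which is not shown in the thread); it is a Python illustration that writes a one-column CSV with a header row, so that each log line becomes one string field. The column name `event` is borrowed from the query example elsewhere in this thread and is an assumption here, as is the function name `alltxt_to_csv`.

```python
import csv
import io

def alltxt_to_csv(src, dst, field="event"):
    """Write one CSV row per non-empty input line, under a single header column.

    src and dst are text file objects. The csv module quotes lines that
    contain commas or double quotes, so the whole line stays one field.
    Returns the number of data rows written.
    """
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow([field])
    count = 0
    for line in src:
        text = line.rstrip("\r\n")
        if text.strip():
            writer.writerow([text])
            count += 1
    return count

# In-memory demonstration; real use would pass open file handles instead.
src = io.StringIO(
    "190228_235945 14.074 Rx FT8 -3 0.9 2255 WB4HMA HC2AO -11\n"
    "\n"
    'made-up line with, a comma and "quotes"\n'
)
dst = io.StringIO()
rows_written = alltxt_to_csv(src, dst)
print(rows_written)
print(dst.getvalue())
```

A CSV with a header row is the kind of file a bulk import tool can load as single-field documents; the exact import command depends on the tool and version in use.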
Re: [wsjt-devel] ALL.TXT (again)
Hello All,

This is similar to how I parse the file also; read / split the line and check line[0], then do what's needed based on checking the first string.

At present, my ALL.TXT is over 400MB. What I've been doing to prevent read-lock issues is creating a daily diff file between a copy and the active ALL.TXT file, then sticking each diff file in a folder to process at whatever time I wish without affecting WSJT-X operations:

    alltxt-diff-20190629-0300.txt
    alltxt-diff-20190630-0300.txt
    etc, etc, etc

After each diff run, I update the copy so it's ready for the next day. There are hundreds of ways to accomplish the same thing, but I found this to be easy and fairly painless (disk space is cheap these days :-) )

What to do with the data afterwards has been my focus of late. I've been playing around with MongoDB (a schema-less JSON/BSON document storage database) to stick the decoded lines in. You can either split the lines, or just stick the entire line in as a new document for long-term storage / later-date access. The $regex processing capability of MongoDB is extensive, and very fast! One can easily parse a multitude of string combinations, even with the entire line in one field, for example:

    use wsjtx;
    db.alltxt.find( { $and: [ {event:{$regex:'MY-CALL'}}, {event:{$regex:'HIS-CALL'}} ] });

That would print the lines (documents) that contain both 'MY-CALL' and 'HIS-CALL'. You could add 'DATE_STRING' or any combination you wish to further refine the search without having to split the lines at all.

In case folks are worried about the number of documents in each collection, I've added the entire WSPR Decode Archive (from WSPRnet) to a MongoDB database/collection set (one collection for each year, 2008 thru 2019, at just over 95GB on disk). Later collections have "millions" of decodes in them. Single-collection query times are <= 1 to 2 seconds. With added indexing, times are in the millisecond range :-) Aggregate queries, those spanning multiple collections/years, vary in time depending on the data being sought, but are well within an acceptable time limit for most use cases I've had.

73's
Greg, KI7MT
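The daily diff step Greg describes could be sketched in Python as below. This is not his script; it assumes ALL.TXT is only ever appended to, so the "diff" between the copy and the active file is simply the bytes past the end of the copy. The function name `write_daily_diff` and all paths are hypothetical.

```python
from pathlib import Path

def write_daily_diff(active: Path, copy: Path, diff: Path) -> int:
    """Save what was appended to `active` since `copy` was last updated.

    Assumes the active file is append-only: everything past the size of
    the copy is new. The new bytes go to `diff`, and the copy is then
    extended so it is ready for the next run. Returns the byte count.
    """
    offset = copy.stat().st_size if copy.exists() else 0
    with active.open("rb") as src:
        src.seek(offset)          # skip what the copy already holds
        new_data = src.read()
    diff.write_bytes(new_data)
    with copy.open("ab") as dst:  # bring the copy up to date
        dst.write(new_data)
    return len(new_data)

# Hypothetical usage, e.g. from a scheduled job at 03:00:
# write_daily_diff(Path("ALL.TXT"), Path("ALL.copy.txt"),
#                  Path("diffs/alltxt-diff-20190629-0300.txt"))
```

Reading only from the saved offset keeps each run short even when the main file is several hundred megabytes. If the active file is ever truncated or replaced, the append-only assumption breaks and the copy should be rebuilt from scratch.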
Re: [wsjt-devel] ALL.TXT (again)
On 7/1/19 7:59 AM, Claude Frantz wrote:

Just as an example, a code extract in Perl:

    if ($line =~ m/^(\d{4})-([A-Z][a-z]{2})-(\d{2})\b/ ) {
        $day = $3 ;
        $month_alpha = $2 ;
        $year = $1 ;
    }
    elsif ($line =~ m/^(\d\d)(\d\d)(\d\d)_\d{6}\b/ ) {
        $day = $3 ;
        $month_num = $2 ;
        $year = 2000 + $1 ;
    }
    elsif ($line =~ m/^(\d{4})-(\d\d)-(\d\d)\b/ ) {
        $day = $3 ;
        $month_num = $2 ;
        $year = $1 ;
    }

I have not tested it; I hope there is no error. This allows decoding the 3 formats of ALL.TXT that I remember. Please note that the month can be numeric or alphabetic. If it is alphabetic, you have to convert it to numeric if you want to compare it to a numeric value. Please note also that the mode switch was an extra line in previous formats.

Best wishes,
Claude (DJ0OT)
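For readers who do not use Perl, the same three-way match can be sketched in Python. The regular expressions are taken unchanged from the extract above; the function name `parse_date` and the month table are additions, and the table assumes the alphabetic months are English three-letter abbreviations.

```python
import re

# Assumption: alphabetic months are English abbreviations such as "Jan".
MONTHS = {name: num for num, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

# Same three patterns, tried in the same order as the Perl extract.
PATTERNS = [
    ("alpha",   re.compile(r"^(\d{4})-([A-Z][a-z]{2})-(\d{2})\b")),
    ("compact", re.compile(r"^(\d\d)(\d\d)(\d\d)_\d{6}\b")),
    ("numeric", re.compile(r"^(\d{4})-(\d\d)-(\d\d)\b")),
]

def parse_date(line):
    """Return (year, month, day) as ints, or None if no format matches."""
    for kind, pattern in PATTERNS:
        m = pattern.match(line)
        if m is None:
            continue
        year, month, day = m.groups()
        if kind == "alpha":
            month_num = MONTHS.get(month)
            if month_num is None:
                return None          # unknown month name
        else:
            month_num = int(month)
        # The compact format stores a two-digit year.
        full_year = 2000 + int(year) if kind == "compact" else int(year)
        return full_year, month_num, int(day)
    return None

print(parse_date("190228_235945 14.074 Rx FT8 -3 0.9 2255 WB4HMA HC2AO -11"))
print(parse_date("2017-01-01 00:00 7.076 MHz JT9+JT65"))
```

Converting the alphabetic month at parse time means every caller gets the same numeric tuple regardless of which release wrote the line.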
Re: [wsjt-devel] ALL.TXT (again)
On 6/30/19 10:18 PM, Dan Malcolm wrote:

> That’s correct Claude. But my PHP program has to deal with both formats in 2019. Given that one of the formats will be found, all I have to detect is a change in month, which comes after the date is harvested from the line (string).

I suggest trying to match both formats, in sequence. When the first one matches, you decode the date. If it does not, you try to match the second format. When that matches, you extract the date. When there is no match at all, the line contains other data and you should ignore it.

Be sure to match at the beginning of the line. Note the "^" as the first character of the regex.

Best wishes,
Claude (DJ0OT)
Re: [wsjt-devel] ALL.TXT (again)
That’s correct Claude. But my PHP program has to deal with both formats in 2019. Given that one of the formats will be found, all I have to detect is a change in month, which comes after the date is harvested from the line (string).

__
Dan – K4SHQ
Re: [wsjt-devel] ALL.TXT (again)
On 6/30/19 5:41 PM, Dan Malcolm wrote:

Hi Dan,

> Good point Mike. Right now I’m using the PHP regex function and “(\d{4})-(\d{2})-(\d{2})”. That worked until the format change. The function returns a T/F status and sticks the result into an array.

This doesn't match with the current format.

> Claude recommended a regex '^\d{6}_\d{6}\b' to detect the new format. I think both will work.

The regex, which I have mentioned, works fine with the current format, but not with previous ones.

Best wishes,
Claude (DJ0OT)
Re: [wsjt-devel] ALL.TXT (again)
Good point Mike. Right now I’m using the PHP regex function and “(\d{4})-(\d{2})-(\d{2})”. That worked until the format change. The function returns a T/F status and sticks the result into an array. Claude recommended a regex '^\d{6}_\d{6}\b' to detect the new format. I think both will work. I’ll do some research to see which method would be quickest and to see if PHP has a function comparable to ‘sscanf’. I’ll have to use both for the time being, as 2019’s ALL.TXT will have both formats.

Thanks to both of you.

__
Dan – K4SHQ
Re: [wsjt-devel] ALL.TXT (again)
Yup...sscanf exists in PHP. Most people don't know about the power of sscanf to parse fixed-format data.

The old format in ALL.TXT was like this:

    2017-01-01 00:00 7.076 MHz JT9+JT65

So this should work:

    /* the argument list was lost in the archive; these names are placeholders */
    int n = sscanf(buf, "%4d-%2d-%2d %2d:%2d", &year, &month, &day, &hour, &minute);
    if (n == 5) parsed_ok..otherwise fall through.
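Python has no sscanf, but the same fixed-format parse of the old header line can be approximated with one anchored regular expression. The sketch below is an illustration only: the function name `parse_old_header` is made up, and the pattern is modelled on the single old-format sample line quoted in this thread.

```python
import re
from datetime import datetime

# Modelled on the sample: "2017-01-01 00:00 7.076 MHz JT9+JT65"
OLD_HEADER = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2})\s+(\d+\.\d+) MHz\s+(\S+)"
)

def parse_old_header(line):
    """Return (timestamp, frequency in MHz, mode), or None if not a header."""
    m = OLD_HEADER.match(line)
    if m is None:
        return None               # fall through to the next format check
    year, month, day, hour, minute = (int(g) for g in m.groups()[:5])
    try:
        stamp = datetime(year, month, day, hour, minute)
    except ValueError:
        return None               # digits matched but are not a real date
    return stamp, float(m.group(6)), m.group(7)

print(parse_old_header("2017-01-01 00:00 7.076 MHz JT9+JT65"))
```

As with checking sscanf's return value against the expected field count, a None result means the line is in some other layout and the caller should try the next parser.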
Re: [wsjt-devel] ALL.TXT (again)
This logic should work until the format changes in 2099 or so unless it's just allowed to roll over in which case it will still work.You just need to get the current yymm instead of a "saved" date.If the yymm changes that will allow you to track any idle period correctly. #include #include int main(int nargs, char *argv[]){ int yymmsave = 0; char buf[256]; FILE *fp = fopen(argv[1], "r"); if (fp == NULL) { perror(argv[1]); exit(1); } while (fgets(buf, sizeof(buf), fp)) { int yymm, dd, time; int n = sscanf(buf, "%4d%2d_%6d", , , ); if (n == 3 && yymm != yymmsave) { printf("%04d\n", yymm); yymmsave = yymm; } } fclose(fp);} de Mike W9MDB On Saturday, June 29, 2019, 11:32:47 PM CDT, Dan Malcolm wrote: I’m thinking that if I know I’m in year ‘19’ and the month changes from ’02 to GT ‘02’ then I can count that as a month change. Not likely but I can envision where I wounn’t be on the air for a month or more. __ Dan – K4SHQ From: Black Michael via wsjt-devel [mailto:wsjt-devel@lists.sourceforge.net] Sent: Saturday, June 29, 2019 10:39 PM To: wsjtx-devel Cc: Black Michael Subject: Re: [wsjt-devel] ALL.TXT (again) There's nothing special about the month rollover The format is YYMMDD so here's the Feb-Mar rollover in my file for example. 190228_235945 14.074 Rx FT8 -3 0.9 2255 WB4HMA HC2AO -11 190301_00 14.074 Rx FT8 -13 0.9 2598 VE1DBM LU1JAO -08 de Mike W9MDB On Saturday, June 29, 2019, 09:35:19 PM CDT, Dan Malcolm wrote: I am aware that ALL.TXT data formatting changed in late February this year. I am trying to write a PHP program to split All.TXT in monthly text files. I have program that does this for 2018, but it only finds January and February of 2019. The problem is probably a format change. Can anyone help me find the first entry of the month format? Is it MMDD_”time”? or something similar? 
__
Dan – K4SHQ
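The month-change test discussed above could be extended to do the actual split into monthly files. What follows is only a sketch in C, not the PHP program Dan is writing: the helper names (`month_key`, `month_filename`, `split_by_month`) and the output naming scheme `ALL_YYMM.TXT` are assumptions made for illustration, and only lines that begin with the newer YYMMDD_HHMMSS prefix are handled.

```c
#include <stdio.h>

/* Return the YYMM prefix of a line such as "190228_235945 ..." (1902),
 * or -1 if the line does not parse as YYMMDD_HHMMSS.
 * Hypothetical helper, not part of WSJT-X. */
int month_key(const char *line)
{
    int yymm, dd, hhmmss;
    if (sscanf(line, "%4d%2d_%6d", &yymm, &dd, &hhmmss) != 3)
        return -1;
    return yymm;
}

/* Build a per-month file name such as "ALL_1903.TXT".
 * The naming scheme is an assumption for this sketch. */
void month_filename(int yymm, char *out, size_t size)
{
    snprintf(out, size, "ALL_%04d.TXT", yymm);
}

/* Append every timestamped line of `in` to the file for its month.
 * Lines without a timestamp prefix are skipped. Assumes lines are
 * shorter than the buffer. Returns the number of lines written,
 * or -1 if an output file cannot be opened. */
long split_by_month(FILE *in)
{
    char buf[512], name[32];
    FILE *out = NULL;
    int saved = -1;
    long written = 0;

    while (fgets(buf, sizeof buf, in)) {
        int key = month_key(buf);
        if (key < 0)
            continue;
        if (key != saved) {          /* month changed, possibly by more than one */
            if (out)
                fclose(out);
            month_filename(key, name, sizeof name);
            out = fopen(name, "a");
            if (!out)
                return -1;
            saved = key;
        }
        fputs(buf, out);
        written++;
    }
    if (out)
        fclose(out);
    return written;
}
```

Because the comparison is "different from the saved value" rather than "greater by one", a gap of several idle months simply opens the next file that has data.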
Re: [wsjt-devel] ALL.TXT (again)
On 6/30/19 6:32 AM, Dan Malcolm wrote:

Hi Dan,

> I’m thinking that if I know I’m in year ‘19’ and the month changes from ‘02’ to GT ‘02’ then I can count that as a month change. Not likely, but I can envision a case where I wouldn’t be on the air for a month or more.

I'm not sure, but I think that lines with different formats can still occur in ALL.TXT. I suggest making sure that you only examine the lines matching the regex '^\d{6}_\d{6}\b'.

Best wishes,
Claude (DJ0OT)
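For anyone working in C rather than Perl or PHP, the same filter can be written by hand, since POSIX regular expressions do not portably support `\d` or `\b`. This is a minimal sketch for ASCII input; `has_timestamp_prefix` is a hypothetical name, not an existing function.

```c
#include <ctype.h>
#include <stdbool.h>

/* True if `line` starts with six digits, an underscore, six more digits,
 * and then either the end of the string or a non-word character.
 * This mirrors the regex ^\d{6}_\d{6}\b for ASCII text. */
bool has_timestamp_prefix(const char *line)
{
    for (int i = 0; i < 13; i++) {
        unsigned char c = (unsigned char)line[i];
        if (i == 6) {
            if (c != '_')
                return false;
        } else if (!isdigit(c)) {
            return false;   /* also stops safely at the terminating '\0' */
        }
    }
    /* Word boundary: the next character must not be a letter, digit or '_'. */
    unsigned char next = (unsigned char)line[13];
    return !(isalnum(next) || next == '_');
}
```

A line whose time field has fewer than six digits is rejected by this check, just as it would be by the regex.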
Re: [wsjt-devel] ALL.TXT (again)
I’m thinking that if I know I’m in year ‘19’ and the month changes from ‘02’ to GT ‘02’ then I can count that as a month change. Not likely, but I can envision a case where I wouldn’t be on the air for a month or more.

__
Dan – K4SHQ

From: Black Michael via wsjt-devel [mailto:wsjt-devel@lists.sourceforge.net]
Sent: Saturday, June 29, 2019 10:39 PM
To: wsjtx-devel
Cc: Black Michael
Subject: Re: [wsjt-devel] ALL.TXT (again)

There's nothing special about the month rollover. The format is YYMMDD, so here's the Feb-Mar rollover in my file, for example.

190228_235945 14.074 Rx FT8 -3 0.9 2255 WB4HMA HC2AO -11
190301_00 14.074 Rx FT8 -13 0.9 2598 VE1DBM LU1JAO -08

de Mike W9MDB

On Saturday, June 29, 2019, 09:35:19 PM CDT, Dan Malcolm wrote:

I am aware that ALL.TXT data formatting changed in late February this year. I am trying to write a PHP program to split ALL.TXT into monthly text files. I have a program that does this for 2018, but it only finds January and February of 2019. The problem is probably a format change. Can anyone help me find the first entry of the month format? Is it MMDD_”time”? Or something similar?

__
Dan – K4SHQ
Re: [wsjt-devel] ALL.TXT (again)
There's nothing special about the month rollover. The format is YYMMDD, so here's the Feb-Mar rollover in my file, for example.

190228_235945 14.074 Rx FT8 -3 0.9 2255 WB4HMA HC2AO -11
190301_00 14.074 Rx FT8 -13 0.9 2598 VE1DBM LU1JAO -08

de Mike W9MDB

On Saturday, June 29, 2019, 09:35:19 PM CDT, Dan Malcolm wrote:

I am aware that ALL.TXT data formatting changed in late February this year. I am trying to write a PHP program to split ALL.TXT into monthly text files. I have a program that does this for 2018, but it only finds January and February of 2019. The problem is probably a format change. Can anyone help me find the first entry of the month format? Is it MMDD_”time”? Or something similar?

__
Dan – K4SHQ
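The sample lines above can also be split into fields with a single `sscanf` call. This is a sketch only: the struct and the field names are assumptions read off the sample lines (date, time, dial frequency in MHz, Rx/Tx, mode, SNR, DT, audio frequency, message text) and not an official description of the ALL.TXT layout, which has varied between WSJT-X releases.

```c
#include <stdio.h>

/* One decoded line in the newer single-line format.
 * Field names are assumptions for illustration. */
struct decode {
    int    date;       /* YYMMDD */
    int    time;       /* HHMMSS */
    double mhz;        /* dial frequency, MHz */
    char   dir[3];     /* "Rx" or "Tx" */
    char   mode[8];    /* e.g. "FT8" */
    int    snr;        /* reported signal-to-noise, dB */
    double dt;         /* time offset, seconds */
    int    audio_hz;   /* audio frequency, Hz */
    char   msg[64];    /* remaining message text */
};

/* Parse one line into *d. Returns 1 if all nine fields were read,
 * 0 otherwise (for example, for lines in an older format). */
int parse_decode(const char *line, struct decode *d)
{
    int n = sscanf(line, "%6d_%6d %lf %2s %7s %d %lf %d %63[^\r\n]",
                   &d->date, &d->time, &d->mhz, d->dir, d->mode,
                   &d->snr, &d->dt, &d->audio_hz, d->msg);
    return n == 9;
}
```

Checking the return value before using the struct keeps lines with a different structure out of any later statistics.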
[wsjt-devel] ALL.TXT (again)
I am aware that ALL.TXT data formatting changed in late February this year. I am trying to write a PHP program to split ALL.TXT into monthly text files. I have a program that does this for 2018, but it only finds January and February of 2019. The problem is probably a format change. Can anyone help me find the first entry of the month format? Is it MMDD_”time”? Or something similar?

__
Dan – K4SHQ