All was good with the import with the new wdimport.py file. Appreciate all the development efforts that have been done.
Thank On Friday, September 18, 2020 at 7:19:47 AM UTC-4, gjr80 wrote: > > Andrew, > > Can you try the attached wdimport.py in place of your current version. To > install: > > 1. rename your current wdimport.py: > > $ mv > /usr/share/weewx/weeimport/wdimport.py > /usr/share/weewx/weeimport/wdimport_orig.py > > 2. download the attached wdimport.py to /usr/share/weewx/weeimport/ > > 3. run the import on the problem files again > > This version of wdimport.py checks each log file line for null bytes and > if found removes them before the line is parsed. Any such lines are listed > on the console and saved to log but given the number of lines being > imported they could be quite easy to miss in the log. The 82013 file had a > lot of duplicate timestamps (which are listed to screen at the end of the > import) so it is quite easy to miss the null byte line output to console as > well. > > Let us know how this goes and if successful I will include it in the next > release. > > Gary > > On Thursday, 17 September 2020 20:20:12 UTC+10, gjr80 wrote: >> >> Apologies Andrew, I misread and thought you provided the data files for >> use in improving wee_import WD imports rather than seeking help with the >> import. Other things have taken up my time, I will try to have a look >> Friday. >> >> Gary >> >> On Thursday, 17 September 2020 at 08:39:49 UTC+10 Andrew M wrote: >> >>> Bumping this post to see if anyone has any suggestions on my the >>> attached is failing import. >>> >>> Thank you in advance. >>> >>> >>> >>> On Wednesday, September 9, 2020 at 11:02:01 PM UTC-4, Andrew M wrote: >>>> >>>> >>>> I do appreciate the assistance. Hopefully this will help others in >>>> either with what I am doing or identifying a problem with the import >>>> process. >>>> >>>> I have attached a copy of one Weather Display log file I am having >>>> issues with (82013lg.txt.zip) as well as one I did not have any issues >>>> with >>>> (52013lg.txt.zip). >>>> Also attached is output from wee_debug >>>> >>>> I am not seeing any differences in the data between these files that is >>>> causing 82013lg.txt to choke. >>>> >>>> ~20 files out of ~400 files had an issue. >>>> >>>> Please let me know if any additional information is needed. >>>> >>>> >>>> >>>> From running wee_import on 82013lg.txt >>>> pi@weather:/var/tmp $ wee_import --import-config=/var/tmp/wd.conf >>>> --dry-run --verbose >>>> Using WeeWX configuration file /etc/weewx/weewx.conf >>>> Starting wee_import... >>>> Weather Display monthly log files in the '/var/tmp/WD' directory will >>>> be imported >>>> The following options will be used: >>>> config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf >>>> from=None, to=None >>>> dry-run=True, calc_missing=False, ignore_invalid_data=True >>>> monthly logs are in US units >>>> tranche=300, interval=60 >>>> UV=True, radiation=True ignore extreme temperature and >>>> humidity=True >>>> Using database binding 'wx_binding', which is bound to database >>>> 'roundhillvaw_archive' >>>> Destination table 'archive' unit system is '0x01' (US). >>>> This is a dry run, imported data will not be saved to archive. >>>> Starting dry run import ... >>>> Records covering multiple periods have been identified for import. >>>> Obtaining raw import data for period 1 ... >>>> Traceback (most recent call last): >>>> File "/usr/share/weewx/wee_import", line 900, in <module> >>>> main() >>>> File "/usr/share/weewx/wee_import", line 830, in main >>>> source_obj.run() >>>> File "/usr/share/weewx/weeimport/weeimport.py", line 368, in run >>>> _raw_data = self.getRawData(period) >>>> File "/usr/share/weewx/weeimport/wdimport.py", line 568, in getRawData >>>> for rec in _reader: >>>> File "/usr/lib/python3.7/csv.py", line 112, in __next__ >>>> row = next(self.reader) >>>> _csv.Error: line contains NULL byte >>>> >>>> >>>> >>>> From log file >>>> pi@weather:~ $ sudo tail -f /var/log/weewx.log >>>> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: The >>>> following options will be used: >>>> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: >>>> config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf >>>> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: >>>> from=None, to=None >>>> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: >>>> dry-run=True, calc_missing=False, ignore_invalid_data=True >>>> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: >>>> monthly logs are in US units >>>> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: >>>> tranche=300, interval=60 >>>> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: >>>> UV=True, radiation=True ignore extreme temperature and humidity=True >>>> Sep 9 22:39:01 weather wee_import[16538] INFO weeimport.wdimport: >>>> Using database binding 'wx_binding', which is bound to database >>>> 'roundhillvaw_archive' >>>> Sep 9 22:39:01 weather wee_import[16538] INFO weeimport.wdimport: >>>> Destination table 'archive' unit system is '0x01' (US). >>>> Sep 9 22:39:01 weather wee_import[16538] INFO weeimport.weeimport: >>>> Obtaining raw import data for period 1 ... >>>> Sep 9 22:42:24 weather wee_import[16712] INFO __main__: Starting >>>> wee_import... >>>> Sep 9 22:42:30 weather wee_import[16712] DEBUG weewx.manager: Daily >>>> summary version is 2.0 >>>> Sep 9 22:42:30 weather wee_import[16712] INFO weeimport.wdimport: >>>> Weather Display monthly log files in the '/var/tmp/WD' directory will be >>>> imported >>>> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: The >>>> following options will be used: >>>> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: >>>> config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf >>>> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: >>>> from=None, to=None >>>> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: >>>> dry-run=True, calc_missing=False, ignore_invalid_data=True >>>> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: >>>> monthly logs are in US units >>>> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: >>>> tranche=300, interval=60 >>>> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: >>>> UV=True, radiation=True ignore extreme temperature and humidity=True >>>> Sep 9 22:42:30 weather wee_import[16712] INFO weeimport.wdimport: >>>> Using database binding 'wx_binding', which is bound to database >>>> 'roundhillvaw_archive' >>>> Sep 9 22:42:30 weather wee_import[16712] INFO weeimport.wdimport: >>>> Destination table 'archive' unit system is '0x01' (US). >>>> Sep 9 22:42:30 weather wee_import[16712] INFO weeimport.weeimport: >>>> Obtaining raw import data for period 1 ... >>>> Sep 9 22:42:36 weather wee_import[16714] INFO __main__: Starting >>>> wee_import... >>>> Sep 9 22:42:42 weather wee_import[16714] DEBUG weewx.manager: Daily >>>> summary version is 2.0 >>>> Sep 9 22:42:42 weather wee_import[16714] INFO weeimport.wdimport: >>>> Weather Display monthly log files in the '/var/tmp/WD' directory will be >>>> imported >>>> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: The >>>> following options will be used: >>>> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: >>>> config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf >>>> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: >>>> from=None, to=None >>>> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: >>>> dry-run=True, calc_missing=False, ignore_invalid_data=True >>>> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: >>>> monthly logs are in US units >>>> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: >>>> tranche=300, interval=60 >>>> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: >>>> UV=True, radiation=True ignore extreme temperature and humidity=True >>>> Sep 9 22:42:42 weather wee_import[16714] INFO weeimport.wdimport: >>>> Using database binding 'wx_binding', which is bound to database >>>> 'roundhillvaw_archive' >>>> Sep 9 22:42:42 weather wee_import[16714] INFO weeimport.wdimport: >>>> Destination table 'archive' unit system is '0x01' (US). >>>> Sep 9 22:42:42 weather wee_import[16714] INFO weeimport.weeimport: >>>> Obtaining raw import data for period 1 ... >>>> >>>> >>>> >>>> >>>> >>>> >>>> On Tuesday, September 1, 2020 at 7:12:29 PM UTC-4, gjr80 wrote: >>>>> >>>>> Didn’t take your post as a complaint, sorry if I came across in a >>>>> manner that gave that impression. Just wanted to point out that >>>>> development >>>>> of the WD module was based on a very small sample of data rather than >>>>> from >>>>> some written specification. If the problem was a malformed data file that >>>>> caused wee_import to abort we may be able to harden wee_import skip the >>>>> malformed lines. If the problem is a new format/structure in the data >>>>> file >>>>> that wee_import rejected then we may need to rework the WD module to >>>>> handle >>>>> this new format/structure. >>>>> >>>>> If you want to send any relevant log entries/errors and a copy of a >>>>> misbehaving data file by direct email thats fine by me. >>>>> >>>>> Gary >>>>> >>>>> On Wednesday, 2 September 2020 at 08:15:51 UTC+10 Andrew M wrote: >>>>> >>>>>> Please don't take what i have written as a complaint. >>>>>> I appreciate all the hard work it took in the creation of WeeWx and >>>>>> all the contributors to the forum. >>>>>> >>>>>> I have not looked at the content of those files it choked on to see >>>>>> if it has good data in it. Let me first do that. I will reply back once >>>>>> I >>>>>> have done that. >>>>>> >>>>>> Thank you. >>>>>> >>>>>> andrew >>>>>> >>>>>> On Tuesday, September 1, 2020 at 5:20:30 PM UTC-4 gjr80 wrote: >>>>>> >>>>>>> If you to provide some details on problems the handful of files >>>>>>> experienced I am happy to look at wee_import to see if any changes can >>>>>>> be >>>>>>> made to improve its handling of such files. The WD import module of >>>>>>> wee_import was developed based on a handful of WD log files found on >>>>>>> the >>>>>>> internet, so it is quite possible there are some corner cases that may >>>>>>> cause wee_import to reject a file. >>>>>>> >>>>>>> Gary >>>>>>> >>>>>>> On Wednesday, 2 September 2020 at 07:08:20 UTC+10 Andrew M wrote: >>>>>>> >>>>>>>> After many, many hours the wee-import processed all but a handful >>>>>>>> of files. Have several that choked on the import. >>>>>>>> Ended up with 3,873,678 records. >>>>>>>> >>>>>>>> Next I need to validate that the data imported to look for any >>>>>>>> potential bad data. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Friday, August 21, 2020 at 7:31:36 PM UTC-4 Andrew M wrote: >>>>>>>> >>>>>>>>> Thank you for the response. >>>>>>>>> >>>>>>>>> I actually changed up how I am going about this. My WD ran on a >>>>>>>>> Windows10 box. I had a RaspPi box sitting in a box so i decided to >>>>>>>>> use that >>>>>>>>> for WeeWx. >>>>>>>>> I was formatting the hard drive on the Win10 machine when i had >>>>>>>>> the thought that I should just put Debian on that and use that for >>>>>>>>> WeeWx. >>>>>>>>> Which I did. This machine is a little faster that the RaspPi , so >>>>>>>>> one I have other things straight I will use that for the conversion. >>>>>>>>> Migrating the MySql from WD to the tables that WeeWX has is probably >>>>>>>>> doable, but really don't want to sit down and think about it that >>>>>>>>> much. I >>>>>>>>> will just let the current weather collect on the RaspPi box and then >>>>>>>>> once >>>>>>>>> the import is done on the now Debian box I can combine the two DB >>>>>>>>> much >>>>>>>>> easier. >>>>>>>>> >>>>>>>>> I have 60k rows on the MySQL DB with my current webhost, but more >>>>>>>>> than that saved external to that on a hard drive from a previous >>>>>>>>> webhost >>>>>>>>> that I never moved. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wednesday, August 19, 2020 at 8:37:29 PM UTC-4 gjr80 wrote: >>>>>>>>> >>>>>>>>>> Sorry can't help you with MySQL to MySQL migration. >>>>>>>>>> >>>>>>>>>> Regards wee_import though, yes it can be slow. When I wrote the >>>>>>>>>> WD import module the WD user who first used it in anger had >>>>>>>>>> something like >>>>>>>>>> (from memory) 10 years of data to import. The import had to be done >>>>>>>>>> in >>>>>>>>>> batches (again from memory) of 2-3 years each (wee_import uses >>>>>>>>>> transactions on the database but does keep track of duplicate >>>>>>>>>> timestamps >>>>>>>>>> and a few other things so memory usage does grow as the span of the >>>>>>>>>> import >>>>>>>>>> grows). I found one email from the user with the results of the >>>>>>>>>> first batch >>>>>>>>>> import and 1.4 millions records were imported on a Raspberry Pi in >>>>>>>>>> 58 >>>>>>>>>> minutes. Al things considered I find that reasonable. >>>>>>>>>> >>>>>>>>>> There are a couple of things you can do to speed up wee_import. >>>>>>>>>> You can tweak a the tranche setting in the import config file, >>>>>>>>>> this alters the size of the transactions (in records) that >>>>>>>>>> wee_import uses. >>>>>>>>>> The default is 250, you could raise this which will result in fewer >>>>>>>>>> db >>>>>>>>>> transactions but it will likely increase memory usage so you may >>>>>>>>>> need to do >>>>>>>>>> the import in smaller batches. One other approach if using a >>>>>>>>>> slow(ish) RPi >>>>>>>>>> as your WeeWX machine is to do just the import on a faster machine >>>>>>>>>> and then >>>>>>>>>> copy the imported data to the WeeWX RPi. Granted this is simpler >>>>>>>>>> when using >>>>>>>>>> SQLite but depending on your setup could be adapted for MySQL. >>>>>>>>>> >>>>>>>>>> You say you have 60 000 odd MySQL records, that does not seem >>>>>>>>>> like much, how does that correlate with the number of entries in the >>>>>>>>>> WD log >>>>>>>>>> files? >>>>>>>>>> >>>>>>>>>> Gary >>>>>>>>>> >>>>>>>>>> On Thursday, 20 August 2020 08:48:46 UTC+10, Andrew M wrote: >>>>>>>>>>> >>>>>>>>>>> I started to use the wee_import process to process all my WD log >>>>>>>>>>> files to WeeWx MySqlDB, and it was taking a long time. It seems >>>>>>>>>>> like If I >>>>>>>>>>> have many years of data it will take that long to import them into >>>>>>>>>>> WeeWx >>>>>>>>>>> MySqlDB. >>>>>>>>>>> >>>>>>>>>>> I then thought, oh wait, i already have the WD data in a MySql >>>>>>>>>>> DB, so why am i doing this process. >>>>>>>>>>> >>>>>>>>>>> Now I have to figure out how I can gracefully import all the WD >>>>>>>>>>> data I have in a MySQL DB to the one I set up for WeeWx. Both are >>>>>>>>>>> on the >>>>>>>>>>> same hosted server. Different DB names. >>>>>>>>>>> >>>>>>>>>>> There is one table for WD and multiple tables for WeeWx so have >>>>>>>>>>> no idea on where to begin with this. I have ~60,000 rows of data in >>>>>>>>>>> the WD >>>>>>>>>>> MySQL DB table. >>>>>>>>>>> >>>>>>>>>>> Does anyone have a graceful way of migrating the WD MySQL DB >>>>>>>>>>> into a WeeWX DB? >>>>>>>>>>> >>>>>>>>>>> Am I overlooking something in the documentation or in group? >>>>>>>>>>> >>>>>>>>>>> Thank you. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- You received this message because you are subscribed to the Google Groups "weewx-user" group. To unsubscribe from this group and stop receiving emails from it, send an email to weewx-user+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/weewx-user/aeada2a9-d2d8-439d-a7da-8f4b58b4aa89o%40googlegroups.com.