Bumping this post to see if anyone has any suggestions on why the attached file is failing to import.
Thank you in advance.

On Wednesday, September 9, 2020 at 11:02:01 PM UTC-4, Andrew M wrote:
>
> I do appreciate the assistance. Hopefully this will help others, either
> with what I am doing or in identifying a problem with the import process.
>
> I have attached a copy of one Weather Display log file I am having issues
> with (82013lg.txt.zip) as well as one I did not have any issues with
> (52013lg.txt.zip).
> Also attached is the output from wee_debug.
>
> I am not seeing any differences in the data between these files that
> would cause 82013lg.txt to choke.
>
> ~20 files out of ~400 files had an issue.
>
> Please let me know if any additional information is needed.
>
> From running wee_import on 82013lg.txt
> pi@weather:/var/tmp $ wee_import --import-config=/var/tmp/wd.conf --dry-run --verbose
> Using WeeWX configuration file /etc/weewx/weewx.conf
> Starting wee_import...
> Weather Display monthly log files in the '/var/tmp/WD' directory will be imported
> The following options will be used:
> config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf
> from=None, to=None
> dry-run=True, calc_missing=False, ignore_invalid_data=True
> monthly logs are in US units
> tranche=300, interval=60
> UV=True, radiation=True ignore extreme temperature and humidity=True
> Using database binding 'wx_binding', which is bound to database 'roundhillvaw_archive'
> Destination table 'archive' unit system is '0x01' (US).
> This is a dry run, imported data will not be saved to archive.
> Starting dry run import ...
> Records covering multiple periods have been identified for import.
> Obtaining raw import data for period 1 ...
> Traceback (most recent call last):
>   File "/usr/share/weewx/wee_import", line 900, in <module>
>     main()
>   File "/usr/share/weewx/wee_import", line 830, in main
>     source_obj.run()
>   File "/usr/share/weewx/weeimport/weeimport.py", line 368, in run
>     _raw_data = self.getRawData(period)
>   File "/usr/share/weewx/weeimport/wdimport.py", line 568, in getRawData
>     for rec in _reader:
>   File "/usr/lib/python3.7/csv.py", line 112, in __next__
>     row = next(self.reader)
> _csv.Error: line contains NULL byte
>
> From log file
> pi@weather:~ $ sudo tail -f /var/log/weewx.log
> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: The following options will be used:
> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf
> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: from=None, to=None
> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: dry-run=True, calc_missing=False, ignore_invalid_data=True
> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: monthly logs are in US units
> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: tranche=300, interval=60
> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: UV=True, radiation=True ignore extreme temperature and humidity=True
> Sep 9 22:39:01 weather wee_import[16538] INFO weeimport.wdimport: Using database binding 'wx_binding', which is bound to database 'roundhillvaw_archive'
> Sep 9 22:39:01 weather wee_import[16538] INFO weeimport.wdimport: Destination table 'archive' unit system is '0x01' (US).
> Sep 9 22:39:01 weather wee_import[16538] INFO weeimport.weeimport: Obtaining raw import data for period 1 ...
> Sep 9 22:42:24 weather wee_import[16712] INFO __main__: Starting wee_import...
> Sep 9 22:42:30 weather wee_import[16712] DEBUG weewx.manager: Daily summary version is 2.0
> Sep 9 22:42:30 weather wee_import[16712] INFO weeimport.wdimport: Weather Display monthly log files in the '/var/tmp/WD' directory will be imported
> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: The following options will be used:
> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf
> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: from=None, to=None
> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: dry-run=True, calc_missing=False, ignore_invalid_data=True
> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: monthly logs are in US units
> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: tranche=300, interval=60
> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: UV=True, radiation=True ignore extreme temperature and humidity=True
> Sep 9 22:42:30 weather wee_import[16712] INFO weeimport.wdimport: Using database binding 'wx_binding', which is bound to database 'roundhillvaw_archive'
> Sep 9 22:42:30 weather wee_import[16712] INFO weeimport.wdimport: Destination table 'archive' unit system is '0x01' (US).
> Sep 9 22:42:30 weather wee_import[16712] INFO weeimport.weeimport: Obtaining raw import data for period 1 ...
> Sep 9 22:42:36 weather wee_import[16714] INFO __main__: Starting wee_import...
> Sep 9 22:42:42 weather wee_import[16714] DEBUG weewx.manager: Daily summary version is 2.0
> Sep 9 22:42:42 weather wee_import[16714] INFO weeimport.wdimport: Weather Display monthly log files in the '/var/tmp/WD' directory will be imported
> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: The following options will be used:
> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf
> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: from=None, to=None
> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: dry-run=True, calc_missing=False, ignore_invalid_data=True
> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: monthly logs are in US units
> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: tranche=300, interval=60
> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: UV=True, radiation=True ignore extreme temperature and humidity=True
> Sep 9 22:42:42 weather wee_import[16714] INFO weeimport.wdimport: Using database binding 'wx_binding', which is bound to database 'roundhillvaw_archive'
> Sep 9 22:42:42 weather wee_import[16714] INFO weeimport.wdimport: Destination table 'archive' unit system is '0x01' (US).
> Sep 9 22:42:42 weather wee_import[16714] INFO weeimport.weeimport: Obtaining raw import data for period 1 ...
>
> On Tuesday, September 1, 2020 at 7:12:29 PM UTC-4, gjr80 wrote:
>>
>> Didn’t take your post as a complaint, sorry if I came across in a manner
>> that gave that impression. Just wanted to point out that development of
>> the WD module was based on a very small sample of data rather than from
>> some written specification. If the problem was a malformed data file that
>> caused wee_import to abort we may be able to harden wee_import to skip
>> the malformed lines.
>> If the problem is a new format/structure in the data file that
>> wee_import rejected then we may need to rework the WD module to handle
>> this new format/structure.
>>
>> If you want to send any relevant log entries/errors and a copy of a
>> misbehaving data file by direct email that's fine by me.
>>
>> Gary
>>
>> On Wednesday, 2 September 2020 at 08:15:51 UTC+10 Andrew M wrote:
>>
>>> Please don't take what I have written as a complaint.
>>> I appreciate all the hard work it took in the creation of WeeWX and all
>>> the contributors to the forum.
>>>
>>> I have not looked at the content of the files it choked on to see if
>>> they have good data in them. Let me first do that. I will reply back
>>> once I have done that.
>>>
>>> Thank you.
>>>
>>> andrew
>>>
>>> On Tuesday, September 1, 2020 at 5:20:30 PM UTC-4 gjr80 wrote:
>>>
>>>> If you can provide some details on the problems the handful of files
>>>> experienced I am happy to look at wee_import to see if any changes can
>>>> be made to improve its handling of such files. The WD import module of
>>>> wee_import was developed based on a handful of WD log files found on
>>>> the internet, so it is quite possible there are some corner cases that
>>>> may cause wee_import to reject a file.
>>>>
>>>> Gary
>>>>
>>>> On Wednesday, 2 September 2020 at 07:08:20 UTC+10 Andrew M wrote:
>>>>
>>>>> After many, many hours wee_import processed all but a handful of
>>>>> files. Several choked on the import.
>>>>> Ended up with 3,873,678 records.
>>>>>
>>>>> Next I need to validate the imported data to look for any potential
>>>>> bad data.
>>>>>
>>>>> On Friday, August 21, 2020 at 7:31:36 PM UTC-4 Andrew M wrote:
>>>>>
>>>>>> Thank you for the response.
>>>>>>
>>>>>> I actually changed up how I am going about this. My WD ran on a
>>>>>> Windows 10 box. I had a RaspPi box sitting in a box so I decided to
>>>>>> use that for WeeWX.
>>>>>> I was formatting the hard drive on the Win10 machine when I had the
>>>>>> thought that I should just put Debian on that and use that for WeeWX.
>>>>>> Which I did. This machine is a little faster than the RaspPi, so once
>>>>>> I have other things straight I will use that for the conversion.
>>>>>> Migrating the MySQL data from WD to the tables that WeeWX has is
>>>>>> probably doable, but I really don't want to sit down and think about
>>>>>> it that much. I will just let the current weather collect on the
>>>>>> RaspPi box and then once the import is done on the now Debian box I
>>>>>> can combine the two DBs much more easily.
>>>>>>
>>>>>> I have 60k rows on the MySQL DB with my current webhost, but more
>>>>>> than that saved external to that on a hard drive from a previous
>>>>>> webhost that I never moved.
>>>>>>
>>>>>> On Wednesday, August 19, 2020 at 8:37:29 PM UTC-4 gjr80 wrote:
>>>>>>
>>>>>>> Sorry can't help you with MySQL to MySQL migration.
>>>>>>>
>>>>>>> Regards wee_import though, yes it can be slow. When I wrote the WD
>>>>>>> import module the WD user who first used it in anger had something
>>>>>>> like (from memory) 10 years of data to import. The import had to be
>>>>>>> done in batches (again from memory) of 2-3 years each (wee_import
>>>>>>> uses transactions on the database but does keep track of duplicate
>>>>>>> timestamps and a few other things so memory usage does grow as the
>>>>>>> span of the import grows). I found one email from the user with the
>>>>>>> results of the first batch import and 1.4 million records were
>>>>>>> imported on a Raspberry Pi in 58 minutes. All things considered I
>>>>>>> find that reasonable.
>>>>>>>
>>>>>>> There are a couple of things you can do to speed up wee_import. You
>>>>>>> can tweak the tranche setting in the import config file, this alters
>>>>>>> the size of the transactions (in records) that wee_import uses.
>>>>>>> The default is 250, you could raise this which will result in fewer
>>>>>>> db transactions but it will likely increase memory usage so you may
>>>>>>> need to do the import in smaller batches. One other approach if
>>>>>>> using a slow(ish) RPi as your WeeWX machine is to do just the import
>>>>>>> on a faster machine and then copy the imported data to the WeeWX
>>>>>>> RPi. Granted this is simpler when using SQLite but depending on your
>>>>>>> setup could be adapted for MySQL.
>>>>>>>
>>>>>>> You say you have 60 000 odd MySQL records, that does not seem like
>>>>>>> much, how does that correlate with the number of entries in the WD
>>>>>>> log files?
>>>>>>>
>>>>>>> Gary
>>>>>>>
>>>>>>> On Thursday, 20 August 2020 08:48:46 UTC+10, Andrew M wrote:
>>>>>>>>
>>>>>>>> I started to use the wee_import process to import all my WD log
>>>>>>>> files into the WeeWX MySQL DB, and it was taking a long time. It
>>>>>>>> seems that with many years of data it will take a long time to
>>>>>>>> import them into the WeeWX MySQL DB.
>>>>>>>>
>>>>>>>> I then thought, oh wait, I already have the WD data in a MySQL DB,
>>>>>>>> so why am I doing this process?
>>>>>>>>
>>>>>>>> Now I have to figure out how I can gracefully import all the WD
>>>>>>>> data I have in a MySQL DB to the one I set up for WeeWX. Both are
>>>>>>>> on the same hosted server. Different DB names.
>>>>>>>>
>>>>>>>> There is one table for WD and multiple tables for WeeWX so I have
>>>>>>>> no idea where to begin with this. I have ~60,000 rows of data in
>>>>>>>> the WD MySQL DB table.
>>>>>>>>
>>>>>>>> Does anyone have a graceful way of migrating the WD MySQL DB into
>>>>>>>> a WeeWX DB?
>>>>>>>>
>>>>>>>> Am I overlooking something in the documentation or in the group?
>>>>>>>>
>>>>>>>> Thank you.
>>>>>>>>

--
You received this message because you are subscribed to the Google Groups "weewx-user" group.
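
A note on the `_csv.Error: line contains NULL byte` traceback quoted at the top of this thread: Python's csv module raises that error when the text it is reading contains a NUL (0x00) character, so one thing worth checking is whether the failing monthly log files contain NUL bytes that the good files do not. Below is a minimal sketch for counting NUL bytes in a file and for reading the file with them stripped. It is not part of wee_import and is not how wee_import handles these files. The function names `count_nul_bytes` and `read_rows_without_nuls` are made up for illustration, and the space delimiter and UTF-8 encoding are assumptions about the log file, not something confirmed in this thread.

```python
import csv
import io


def count_nul_bytes(path):
    """Return the number of NUL (0x00) bytes in the file at `path`."""
    with open(path, "rb") as f:
        return f.read().count(b"\x00")


def read_rows_without_nuls(path, delimiter=" ", encoding="utf-8"):
    """Yield csv rows from `path` after dropping every NUL byte.

    Hypothetical helper: the delimiter and encoding defaults are assumptions.
    """
    with open(path, "rb") as f:
        raw = f.read()
    # Remove NUL bytes before decoding so csv never sees them.
    text = raw.replace(b"\x00", b"").decode(encoding, errors="replace")
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=delimiter,
        skipinitialspace=True,  # treat runs of spaces as a single separator
    )
    for row in reader:
        yield row
```

Running `count_nul_bytes` over each of the ~20 failing files and over one file that imported cleanly would show whether NUL bytes are what separates the two groups. If they are, writing a cleaned copy of each failing file and pointing wee_import at the copies is one possible workaround, keeping the originals untouched.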