The import went fine with the new wdimport.py file. 

I appreciate all the development effort that has gone into this. 

Thank you.

On Friday, September 18, 2020 at 7:19:47 AM UTC-4, gjr80 wrote:
>
> Andrew,
>
> Can you try the attached wdimport.py in place of your current version? To 
> install:
>
> 1. rename your current wdimport.py:
>
> $ mv /usr/share/weewx/weeimport/wdimport.py /usr/share/weewx/weeimport/wdimport_orig.py
>
> 2. download the attached wdimport.py to /usr/share/weewx/weeimport/
>
> 3. run the import on the problem files again
>
> This version of wdimport.py checks each log file line for null bytes and, 
> if any are found, removes them before the line is parsed. Any such lines are 
> listed on the console and saved to the log, but given the number of lines 
> being imported they could be quite easy to miss in the log. The 82013 file 
> had a lot of duplicate timestamps (which are listed on screen at the end of 
> the import), so it is quite easy to miss the null byte line output on the 
> console as well.
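
For readers following along, the null byte handling described above can be sketched roughly as follows. This is a minimal illustration only, not the code in the attached wdimport.py: the helper name clean_lines, the sample data and the space delimiter are all assumptions made for the example.

```python
import csv
import io


def clean_lines(lines):
    """Yield each line with null bytes removed, reporting the lines changed.

    Hypothetical helper for illustration; not taken from wdimport.py.
    """
    for number, line in enumerate(lines, start=1):
        if "\x00" in line:
            # Report the affected line so it can be checked by hand later.
            print(f"line {number}: null byte(s) removed")
            line = line.replace("\x00", "")
        yield line


# Made-up sample standing in for a WD monthly log file (the second data
# row carries an embedded null byte).
sample = io.StringIO("day month year temp\n1 8 2013 71.\x002\n2 8 2013 69.8\n")

# csv.reader accepts any iterable of strings, so the cleaned generator
# can be passed to it directly.
reader = csv.reader(clean_lines(sample), delimiter=" ")
rows = list(reader)
```

Cleaning each line before it reaches the csv reader means the reader never sees the null byte, so a damaged line is repaired and reported rather than aborting the whole file.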
>
> Let us know how this goes and if successful I will include it in the next 
> release.
>
> Gary
>
> On Thursday, 17 September 2020 20:20:12 UTC+10, gjr80 wrote:
>>
>> Apologies Andrew, I misread and thought you provided the data files for 
>> use in improving wee_import WD imports rather than seeking help with the 
>> import. Other things have taken up my time; I will try to have a look on 
>> Friday.
>>
>> Gary
>>
>> On Thursday, 17 September 2020 at 08:39:49 UTC+10 Andrew M wrote:
>>
>>> Bumping this post to see if anyone has any suggestions on why the 
>>> attached is failing to import.
>>>
>>> Thank you in advance.
>>>
>>>
>>>
>>> On Wednesday, September 9, 2020 at 11:02:01 PM UTC-4, Andrew M wrote:
>>>>
>>>>
>>>> I do appreciate the assistance. Hopefully this will help others, either 
>>>> with what I am doing or in identifying a problem with the import process.
>>>>
>>>> I have attached a copy of one Weather Display log file I am having issues 
>>>> with (82013lg.txt.zip) as well as one I did not have any issues with 
>>>> (52013lg.txt.zip). Also attached is the output from wee_debug.
>>>>
>>>> I am not seeing any differences in the data between these files that would 
>>>> cause 82013lg.txt to choke.
>>>>
>>>> ~20 files out of ~400 files had an issue.
>>>>
>>>> Please let me know if any additional information is needed.
>>>>
>>>>
>>>>
>>>> From running wee_import on 82013lg.txt
>>>> pi@weather:/var/tmp $ wee_import --import-config=/var/tmp/wd.conf --dry-run --verbose
>>>> Using WeeWX configuration file /etc/weewx/weewx.conf
>>>> Starting wee_import...
>>>> Weather Display monthly log files in the '/var/tmp/WD' directory will be imported
>>>> The following options will be used:
>>>>      config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf
>>>>      from=None, to=None
>>>>      dry-run=True, calc_missing=False, ignore_invalid_data=True
>>>>      monthly logs are in US units
>>>>      tranche=300, interval=60
>>>>      UV=True, radiation=True ignore extreme temperature and humidity=True
>>>> Using database binding 'wx_binding', which is bound to database 'roundhillvaw_archive'
>>>> Destination table 'archive' unit system is '0x01' (US).
>>>> This is a dry run, imported data will not be saved to archive.
>>>> Starting dry run import ...
>>>> Records covering multiple periods have been identified for import.
>>>> Obtaining raw import data for period 1 ...
>>>> Traceback (most recent call last):
>>>>   File "/usr/share/weewx/wee_import", line 900, in <module>
>>>>     main()
>>>>   File "/usr/share/weewx/wee_import", line 830, in main
>>>>     source_obj.run()
>>>>   File "/usr/share/weewx/weeimport/weeimport.py", line 368, in run
>>>>     _raw_data = self.getRawData(period)
>>>>   File "/usr/share/weewx/weeimport/wdimport.py", line 568, in getRawData
>>>>     for rec in _reader:
>>>>   File "/usr/lib/python3.7/csv.py", line 112, in __next__
>>>>     row = next(self.reader)
>>>> _csv.Error: line contains NULL byte
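
The traceback above points at a NULL byte somewhere in the log file being read. With several hundred monthly files, one way to find the affected ones is to scan for null bytes before importing. The following is only a sketch: the function name and the `*lg.txt` pattern (taken from file names such as 82013lg.txt) are assumptions, not part of wee_import.

```python
from pathlib import Path


def files_with_null_bytes(directory, pattern="*lg.txt"):
    """Return {file name: [line numbers]} for files containing null bytes.

    Hypothetical helper for illustration; wee_import has no such function.
    """
    found = {}
    for path in sorted(Path(directory).glob(pattern)):
        # Read in binary mode so null bytes are seen exactly as stored.
        with open(path, "rb") as f:
            hits = [n for n, line in enumerate(f, start=1) if b"\x00" in line]
        if hits:
            found[path.name] = hits
    return found

# Example call, using the log directory from the console output above:
#     files_with_null_bytes("/var/tmp/WD")
```

The result lists only the files that need attention, together with the line numbers to inspect.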
>>>>
>>>>
>>>>
>>>> From log file
>>>> pi@weather:~ $ sudo tail -f /var/log/weewx.log
>>>> Sep  9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: The following options will be used:
>>>> Sep  9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport:      config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf
>>>> Sep  9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport:      from=None, to=None
>>>> Sep  9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport:      dry-run=True, calc_missing=False, ignore_invalid_data=True
>>>> Sep  9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport:      monthly logs are in US units
>>>> Sep  9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport:      tranche=300, interval=60
>>>> Sep  9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport:      UV=True, radiation=True ignore extreme temperature and humidity=True
>>>> Sep  9 22:39:01 weather wee_import[16538] INFO weeimport.wdimport: Using database binding 'wx_binding', which is bound to database 'roundhillvaw_archive'
>>>> Sep  9 22:39:01 weather wee_import[16538] INFO weeimport.wdimport: Destination table 'archive' unit system is '0x01' (US).
>>>> Sep  9 22:39:01 weather wee_import[16538] INFO weeimport.weeimport: Obtaining raw import data for period 1 ...
>>>> Sep  9 22:42:24 weather wee_import[16712] INFO __main__: Starting wee_import...
>>>> Sep  9 22:42:30 weather wee_import[16712] DEBUG weewx.manager: Daily summary version is 2.0
>>>> Sep  9 22:42:30 weather wee_import[16712] INFO weeimport.wdimport: Weather Display monthly log files in the '/var/tmp/WD' directory will be imported
>>>> Sep  9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: The following options will be used:
>>>> Sep  9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport:      config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf
>>>> Sep  9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport:      from=None, to=None
>>>> Sep  9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport:      dry-run=True, calc_missing=False, ignore_invalid_data=True
>>>> Sep  9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport:      monthly logs are in US units
>>>> Sep  9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport:      tranche=300, interval=60
>>>> Sep  9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport:      UV=True, radiation=True ignore extreme temperature and humidity=True
>>>> Sep  9 22:42:30 weather wee_import[16712] INFO weeimport.wdimport: Using database binding 'wx_binding', which is bound to database 'roundhillvaw_archive'
>>>> Sep  9 22:42:30 weather wee_import[16712] INFO weeimport.wdimport: Destination table 'archive' unit system is '0x01' (US).
>>>> Sep  9 22:42:30 weather wee_import[16712] INFO weeimport.weeimport: Obtaining raw import data for period 1 ...
>>>> Sep  9 22:42:36 weather wee_import[16714] INFO __main__: Starting wee_import...
>>>> Sep  9 22:42:42 weather wee_import[16714] DEBUG weewx.manager: Daily summary version is 2.0
>>>> Sep  9 22:42:42 weather wee_import[16714] INFO weeimport.wdimport: Weather Display monthly log files in the '/var/tmp/WD' directory will be imported
>>>> Sep  9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: The following options will be used:
>>>> Sep  9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport:      config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf
>>>> Sep  9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport:      from=None, to=None
>>>> Sep  9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport:      dry-run=True, calc_missing=False, ignore_invalid_data=True
>>>> Sep  9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport:      monthly logs are in US units
>>>> Sep  9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport:      tranche=300, interval=60
>>>> Sep  9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport:      UV=True, radiation=True ignore extreme temperature and humidity=True
>>>> Sep  9 22:42:42 weather wee_import[16714] INFO weeimport.wdimport: Using database binding 'wx_binding', which is bound to database 'roundhillvaw_archive'
>>>> Sep  9 22:42:42 weather wee_import[16714] INFO weeimport.wdimport: Destination table 'archive' unit system is '0x01' (US).
>>>> Sep  9 22:42:42 weather wee_import[16714] INFO weeimport.weeimport: Obtaining raw import data for period 1 ...
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tuesday, September 1, 2020 at 7:12:29 PM UTC-4, gjr80 wrote:
>>>>>
>>>>> I didn’t take your post as a complaint; sorry if I came across in a 
>>>>> manner that gave that impression. I just wanted to point out that 
>>>>> development of the WD module was based on a very small sample of data 
>>>>> rather than on a written specification. If the problem was a malformed 
>>>>> data file that caused wee_import to abort, we may be able to harden 
>>>>> wee_import to skip the malformed lines. If the problem is a new 
>>>>> format/structure in the data file that wee_import rejected, then we may 
>>>>> need to rework the WD module to handle this new format/structure.
>>>>>
>>>>> If you want to send any relevant log entries/errors and a copy of a 
>>>>> misbehaving data file by direct email, that's fine by me.
>>>>>
>>>>> Gary
>>>>>
>>>>> On Wednesday, 2 September 2020 at 08:15:51 UTC+10 Andrew M wrote:
>>>>>
>>>>>> Please don't take what I have written as a complaint. 
>>>>>> I appreciate all the hard work that went into the creation of WeeWX and 
>>>>>> all the contributors to the forum.
>>>>>>
>>>>>> I have not yet looked at the content of the files it choked on to see 
>>>>>> whether they contain good data. Let me do that first; I will reply once 
>>>>>> I have.
>>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>> andrew
>>>>>>
>>>>>> On Tuesday, September 1, 2020 at 5:20:30 PM UTC-4 gjr80 wrote:
>>>>>>
>>>>>>> If you can provide some details on the problems the handful of files 
>>>>>>> experienced, I am happy to look at wee_import to see if any changes can 
>>>>>>> be made to improve its handling of such files. The WD import module of 
>>>>>>> wee_import was developed based on a handful of WD log files found on 
>>>>>>> the internet, so it is quite possible there are some corner cases that 
>>>>>>> may cause wee_import to reject a file.
>>>>>>>
>>>>>>> Gary
>>>>>>>
>>>>>>> On Wednesday, 2 September 2020 at 07:08:20 UTC+10 Andrew M wrote:
>>>>>>>
>>>>>>>> After many, many hours wee_import processed all but a handful of 
>>>>>>>> files; several choked on the import.
>>>>>>>> I ended up with 3,873,678 records.
>>>>>>>>
>>>>>>>> Next I need to validate the imported data to look for any potential 
>>>>>>>> bad data.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Friday, August 21, 2020 at 7:31:36 PM UTC-4 Andrew M wrote:
>>>>>>>>
>>>>>>>>> Thank you for the response.
>>>>>>>>>
>>>>>>>>> I actually changed how I am going about this. My WD ran on a 
>>>>>>>>> Windows 10 box. I had a RaspPi sitting in a box, so I decided to use 
>>>>>>>>> that for WeeWX.
>>>>>>>>> I was formatting the hard drive on the Win10 machine when I had the 
>>>>>>>>> thought that I should just put Debian on it and use that for WeeWX, 
>>>>>>>>> which I did. This machine is a little faster than the RaspPi, so once 
>>>>>>>>> I have other things straight I will use it for the conversion. 
>>>>>>>>> Migrating the MySQL data from WD to the tables that WeeWX has is 
>>>>>>>>> probably doable, but I really don't want to sit down and think about 
>>>>>>>>> it that much. I will just let the current weather collect on the 
>>>>>>>>> RaspPi box, and once the import is done on the now-Debian box I can 
>>>>>>>>> combine the two DBs much more easily.
>>>>>>>>>
>>>>>>>>> I have 60k rows in the MySQL DB with my current webhost, but more 
>>>>>>>>> than that saved separately on a hard drive from a previous webhost 
>>>>>>>>> that I never moved. 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wednesday, August 19, 2020 at 8:37:29 PM UTC-4 gjr80 wrote:
>>>>>>>>>
>>>>>>>>>> Sorry, I can't help you with MySQL to MySQL migration. 
>>>>>>>>>>
>>>>>>>>>> Regarding wee_import though, yes, it can be slow. When I wrote the 
>>>>>>>>>> WD import module, the WD user who first used it in anger had 
>>>>>>>>>> something like (from memory) 10 years of data to import. The import 
>>>>>>>>>> had to be done in batches (again from memory) of 2-3 years each 
>>>>>>>>>> (wee_import uses transactions on the database but does keep track of 
>>>>>>>>>> duplicate timestamps and a few other things, so memory usage does 
>>>>>>>>>> grow as the span of the import grows). I found one email from the 
>>>>>>>>>> user with the results of the first batch import: 1.4 million records 
>>>>>>>>>> were imported on a Raspberry Pi in 58 minutes. All things 
>>>>>>>>>> considered, I find that reasonable.
>>>>>>>>>>
>>>>>>>>>> There are a couple of things you can do to speed up wee_import. 
>>>>>>>>>> You can tweak the tranche setting in the import config file; this 
>>>>>>>>>> alters the size of the transactions (in records) that wee_import 
>>>>>>>>>> uses. The default is 250; you could raise this, which will result in 
>>>>>>>>>> fewer db transactions, but it will likely increase memory usage, so 
>>>>>>>>>> you may need to do the import in smaller batches. One other 
>>>>>>>>>> approach, if using a slow(ish) RPi as your WeeWX machine, is to do 
>>>>>>>>>> just the import on a faster machine and then copy the imported data 
>>>>>>>>>> to the WeeWX RPi. Granted, this is simpler when using SQLite, but 
>>>>>>>>>> depending on your setup it could be adapted for MySQL.
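
To illustrate the trade-off behind the tranche setting described above, here is a rough sketch of committing records in fixed-size batches. It is not wee_import's actual code: the function name import_records, the two-column table and the generated records are assumptions made purely for illustration, and an in-memory SQLite database stands in for the real archive.

```python
import sqlite3


def import_records(conn, records, tranche=250):
    """Insert records, committing once per tranche; return the commit count.

    Hypothetical helper for illustration; not part of wee_import.
    """
    commits = 0
    batch = []
    for rec in records:
        batch.append(rec)
        if len(batch) >= tranche:
            with conn:  # one database transaction per full tranche
                conn.executemany("INSERT INTO archive VALUES (?, ?)", batch)
            commits += 1
            batch = []
    if batch:
        with conn:  # final, partly filled tranche
            conn.executemany("INSERT INTO archive VALUES (?, ?)", batch)
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
# Simplified two-column stand-in for the archive table.
conn.execute("CREATE TABLE archive (dateTime INTEGER PRIMARY KEY, outTemp REAL)")

# 1000 made-up one-minute records.
records = [(1377993600 + 60 * i, 70.0) for i in range(1000)]

commits_small = import_records(conn, records[:500], tranche=250)
commits_large = import_records(conn, records[500:], tranche=500)
total_rows = conn.execute("SELECT COUNT(*) FROM archive").fetchone()[0]
```

A larger tranche means fewer commits for the same number of records, at the cost of holding more records in memory before each commit.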
>>>>>>>>>>
>>>>>>>>>> You say you have 60 000 odd MySQL records; that does not seem like 
>>>>>>>>>> much. How does that correlate with the number of entries in the WD 
>>>>>>>>>> log files?
>>>>>>>>>>
>>>>>>>>>> Gary
>>>>>>>>>>
>>>>>>>>>> On Thursday, 20 August 2020 08:48:46 UTC+10, Andrew M wrote:
>>>>>>>>>>>
>>>>>>>>>>> I started to use the wee_import process to import all my WD log 
>>>>>>>>>>> files into the WeeWX MySQL DB, and it was taking a long time. It 
>>>>>>>>>>> seems that with many years of data it will take a long time to 
>>>>>>>>>>> import them into the WeeWX MySQL DB.
>>>>>>>>>>>
>>>>>>>>>>> I then thought, oh wait, I already have the WD data in a MySQL DB, 
>>>>>>>>>>> so why am I doing this?
>>>>>>>>>>>
>>>>>>>>>>> Now I have to figure out how I can gracefully import all the WD 
>>>>>>>>>>> data I have in a MySQL DB into the one I set up for WeeWX. Both are 
>>>>>>>>>>> on the same hosted server, under different DB names.
>>>>>>>>>>>
>>>>>>>>>>> There is one table for WD and multiple tables for WeeWX, so I have 
>>>>>>>>>>> no idea where to begin with this. I have ~60,000 rows of data in 
>>>>>>>>>>> the WD MySQL DB table.
>>>>>>>>>>>
>>>>>>>>>>> Does anyone have a graceful way of migrating the WD MySQL DB into a 
>>>>>>>>>>> WeeWX DB?
>>>>>>>>>>>
>>>>>>>>>>> Am I overlooking something in the documentation or in the group?
>>>>>>>>>>>
>>>>>>>>>>> Thank you.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
