Hi Sebastian

That's great, thank you! I'll give it a try when I have the opportunity to 
update the director/SD again.

Best regards
Samuel

On Monday, March 25, 2024 at 3:46:42 PM UTC+1 Sebastian Sura wrote:

> Hi Samuel,
>
> I think I managed to reproduce your problem. I created a fix here if you 
> want to check it out: https://github.com/bareos/bareos/pull/1745
>
> Kind Regards
>
> Sebastian Sura
> On 04.03.24 at 11:43, 'Samuel' via bareos-users wrote:
>
> I finally had the chance to restart the storage daemon this morning in 
> order to disable autoxflate. 
> Today's copy jobs that ran afterwards finished without any warning, so I 
> think it's probably an issue with autoxflate (on replication?). I can also 
> successfully restore those new copies.
>
> Best regards,
> Samuel
> On Monday, February 26, 2024 at 12:19:50 PM UTC+1 Samuel wrote:
>
>> Yes, that's right. In this case though it's a consolidated incremental 
>> which was copied. However, the warnings also occur during migration jobs of 
>> consolidated fulls. The warning has never occurred while copying/migrating 
>> a non-consolidated incremental. 
>>
>> This is all working just fine except for a couple of jobs which all have 
>> the following in common: the file daemon is on a different host than the 
>> director, they use LZ4 compression, and autoxflate inflates on replication 
>> (autoxflate is also enabled for all the other jobs, but has no effect there 
>> since they don't use compression).
>>
>> I use the SQL Query selection type:
>> Selection Type = SQL Query
>> Selection Pattern = "
>>   SELECT DISTINCT Job.JobId, Job.StartTime
>>   FROM Job, Pool
>>   WHERE Pool.Name IN ('AI-Consolidated', 'AI-Incremental') AND
>>         Pool.PoolId = Job.PoolId AND
>>         Job.Type = 'B' AND
>>         Job.JobStatus IN ('T', 'W') AND
>>         Job.Level IN ('I') AND
>>         Job.JobBytes > 0 AND
>>         Job.JobId NOT IN (
>>             SELECT PriorJobId
>>             FROM Job AS copy
>>             WHERE copy.Type IN ('B', 'C', 'A') AND
>>                   copy.JobStatus IN ('T', 'W') AND
>>                   copy.PriorJobId != 0
>>         )
>>   ORDER by Job.StartTime;"
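
To see which jobs a selection pattern like this picks up, the query can be tried against a throwaway catalog. This is only a minimal sketch, not Bareos code: it uses Python's sqlite3 with a cut-down, hypothetical Job/Pool schema (the real catalog has many more columns), and the sample rows and the 'Tape' pool are invented for illustration.

```python
import sqlite3

# Selection pattern from the copy job above (only keyword case and indentation changed).
SELECTION_QUERY = """
SELECT DISTINCT Job.JobId, Job.StartTime
FROM Job, Pool
WHERE Pool.Name IN ('AI-Consolidated', 'AI-Incremental') AND
      Pool.PoolId = Job.PoolId AND
      Job.Type = 'B' AND
      Job.JobStatus IN ('T', 'W') AND
      Job.Level IN ('I') AND
      Job.JobBytes > 0 AND
      Job.JobId NOT IN (
          SELECT PriorJobId
          FROM Job AS copy
          WHERE copy.Type IN ('B', 'C', 'A') AND
                copy.JobStatus IN ('T', 'W') AND
                copy.PriorJobId != 0
      )
ORDER BY Job.StartTime;
"""


def build_demo_catalog():
    """Create an in-memory catalog with a hypothetical, minimal schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Pool (PoolId INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE Job (
            JobId INTEGER PRIMARY KEY, PoolId INTEGER, Type TEXT, Level TEXT,
            JobStatus TEXT, JobBytes INTEGER, PriorJobId INTEGER, StartTime TEXT
        );
        INSERT INTO Pool VALUES (1, 'AI-Incremental'), (2, 'AI-Consolidated'), (3, 'Tape');
        """
    )
    jobs = [
        # JobId, PoolId, Type, Level, Status, Bytes, PriorJobId, StartTime (all invented)
        (1, 1, "B", "I", "T", 1000, 0, "2024-02-20 01:00:00"),  # not copied yet
        (2, 2, "B", "I", "T", 2000, 0, "2024-02-19 01:00:00"),  # already copied by job 3
        (3, 3, "C", "I", "T", 2000, 2, "2024-02-19 05:00:00"),  # the copy of job 2
        (4, 2, "B", "F", "T", 5000, 0, "2024-02-18 01:00:00"),  # wrong level
        (5, 1, "B", "I", "T", 0, 0, "2024-02-20 02:00:00"),     # empty job
        (6, 1, "B", "I", "E", 100, 0, "2024-02-20 03:00:00"),   # failed job
        (7, 2, "B", "I", "W", 300, 0, "2024-02-21 01:00:00"),   # OK with warnings
    ]
    conn.executemany("INSERT INTO Job VALUES (?, ?, ?, ?, ?, ?, ?, ?)", jobs)
    return conn


def select_jobs_to_copy(conn):
    """Return the JobIds the selection pattern would hand to the copy job."""
    return [row[0] for row in conn.execute(SELECTION_QUERY)]


if __name__ == "__main__":
    print(select_jobs_to_copy(build_demo_catalog()))  # [1, 7]
```

In this toy data only jobs 1 and 7 qualify: job 2 is excluded because job 3 lists it as its PriorJobId, and the others fail the level, size, or status filters.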
>>
>> No, there's no concurrent writing going on currently. Before updating to 
>> Bareos 23 I did have devices with concurrent writing enabled, but that was 
>> before these warnings happened.
>>
>> Best regards,
>> Samuel
>>
>> On Monday, February 26, 2024 at 11:51:59 AM UTC+1 Sebastian Sura wrote:
>>
>>> Hi Samuel
>>>
>>> I just wanted to make sure I completely understand the situation. You 
>>> have an always incremental job with compression
>>> turned on (which algorithm do you use?) and then you have a copy job 
>>> that copies the consolidated full job (on the same SD)
>>> to tape with a device that auto-inflates on write.
>>>
>>> Do I have that right?
>>>
>>> Does the device that the job gets consolidated onto allow concurrent 
>>> writes from multiple jobs?
>>>
>>> What selection type does your copy job use?
>>>
>>> Kind Regards
>>> Sebastian Sura
>>> On 26.02.24 at 11:32, 'Samuel' via bareos-users wrote:
>>>
>>> Hi again. 
>>>
>>> So, as expected, the warning occurred again when job 94511 was copied 
>>> (along with two new warnings for this specific job):
>>> Warning: dird/catreq.cc:608 MD5 digest not same File=3 as attributes=2
>>> Warning: dird/catreq.cc:608 MD5 digest not same File=18 as attributes=17
>>> Warning: dird/catreq.cc:608 MD5 digest not same File=23 as attributes=22
>>>
>>> The restore of the copy also errored in the same manner as before:
>>>  2024-02-26 10:43:50 bareos-fd JobId 95784: Error: findlib/attribs.cc:381 File size of restored file /tmp/bareos-restores/var/backup/old/2024-02-07_23:30/backup_www_1100CC.tar.gz not correct. Original 738212473, restored 442105856.
>>>  2024-02-26 10:44:10 bareos-fd JobId 95784: Error: findlib/attribs.cc:381 File size of restored file /tmp/bareos-restores/var/backup/old/2024-02-12_23:30/backup_www_1100CC.tar.gz not correct. Original 856739265, restored 783548416.
>>>  2024-02-26 10:44:17 bareos-fd JobId 95784: Error: findlib/attribs.cc:381 File size of restored file /tmp/bareos-restores/var/backup/old/2024-02-14_23:30/backup_www_1100CC.tar.gz not correct. Original 856739984, restored 31064064.
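
To put numbers on how much of each file went missing, the size errors can be pulled out of the job log. This is a minimal sketch: the regular expression is my own and only assumes the message wording shown in the errors above, and the helper name `size_shortfalls` is made up for this example.

```python
import re

# Matches the file daemon's size-mismatch message; \s+ tolerates mail line wrapping.
SIZE_ERROR = re.compile(
    r"File size of restored file\s+(?P<path>\S+)\s+not correct\.\s+"
    r"Original (?P<original>\d+), restored (?P<restored>\d+)\."
)

# The three errors from the restore of the copy (JobId 95784), as quoted above.
SAMPLE_LOG = """
2024-02-26 10:43:50 bareos-fd JobId 95784: Error:
findlib/attribs.cc:381 File size of restored file
/tmp/bareos-restores/var/backup/old/2024-02-07_23:30/backup_www_1100CC.tar.gz
not correct. Original 738212473, restored 442105856.
2024-02-26 10:44:10 bareos-fd JobId 95784: Error:
findlib/attribs.cc:381 File size of restored file
/tmp/bareos-restores/var/backup/old/2024-02-12_23:30/backup_www_1100CC.tar.gz
not correct. Original 856739265, restored 783548416.
2024-02-26 10:44:17 bareos-fd JobId 95784: Error:
findlib/attribs.cc:381 File size of restored file
/tmp/bareos-restores/var/backup/old/2024-02-14_23:30/backup_www_1100CC.tar.gz
not correct. Original 856739984, restored 31064064.
"""


def size_shortfalls(log_text):
    """Return (path, original, restored, missing_bytes) for each size error."""
    results = []
    for match in SIZE_ERROR.finditer(log_text):
        original = int(match["original"])
        restored = int(match["restored"])
        results.append((match["path"], original, restored, original - restored))
    return results


if __name__ == "__main__":
    for path, original, restored, missing in size_shortfalls(SAMPLE_LOG):
        print(f"{path}: {missing} bytes missing ({restored} of {original})")
```

For the three files above this reports 296106617, 73190849 and 825675920 missing bytes respectively.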
>>>
>>> For now I'll try disabling autoxflate and see if that solves the issue.
>>>
>>> Best regards,
>>> Samuel
>>>
>>>
>>> On Friday, February 23, 2024 at 2:02:28 PM UTC+1 Samuel wrote:
>>>
>>>> Thanks for looking into it!
>>>>
>>>> Hm, there seems to be no file with fileindex=3 in the database for the 
>>>> copy:
>>>> Enter SQL query: select * from file where jobid=94146 and fileindex=3;
>>>> No results to list. 
>>>>
>>>> All files that are being backed up are just regular files, nothing 
>>>> special.
>>>>
>>>> The copy runs locally on the director's host from its SD to the same SD.
>>>>
>>>> Here are files 2 and 3 of today's consolidated incremental backup, which 
>>>> will be copied tomorrow and will most likely lead to the same warning 
>>>> again, as it has for the last few days (for this specific fileset it 
>>>> always warns about `File=3 attributes=2`):
>>>>
>>>> Enter SQL query: select *, decode_lstat(lstat) from file where jobid=94511 and fileindex=2;
>>>>
>>>> +-------------+-----------+-------+---------+----------+--------+--------+--------+----------------------------------------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------+
>>>> | fileid      | fileindex | jobid | pathid  | deltaseq | markid | fhinfo | fhnode | lstat                                                          | md5                    | name                     | decode_lstat                                                                                 |
>>>> +-------------+-----------+-------+---------+----------+--------+--------+--------+----------------------------------------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------+
>>>> | 484,575,603 |         2 | 94511 | 910,901 |        0 |      0 | 0      | 0      | P0D CgA4 IGk B A i A sADp5 BAA FgAo BlxAt1 BlxASF BlxVYH A A d | D/VouduY5TF4KFacjRE7Hw | backup_www_1100CC.tar.gz | (64771,655416,33188,1,0,34,0,738212473,4096,1441832,1707346805,1707345029,1707431431,0,0,29) |
>>>> +-------------+-----------+-------+---------+----------+--------+--------+--------+----------------------------------------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------+
>>>>
>>>> Enter SQL query: select *, decode_lstat(lstat) from file where jobid=94511 and fileindex=3;
>>>>
>>>> +-------------+-----------+-------+---------+----------+--------+--------+--------+---------------------------------------------------------------+------------------------+-----------------------------+-------------------------------------------------------------------------------------------+
>>>> | fileid      | fileindex | jobid | pathid  | deltaseq | markid | fhinfo | fhnode | lstat                                                         | md5                    | name                        | decode_lstat                                                                              |
>>>> +-------------+-----------+-------+---------+----------+--------+--------+--------+---------------------------------------------------------------+------------------------+-----------------------------+-------------------------------------------------------------------------------------------+
>>>> | 484,575,604 |         3 | 94511 | 910,901 |        0 |      0 | 0      | 0      | P0D CgA5 IGk B A i A BA9gA BAA IHw BlxAt8 BlxASH BlxVYH A A d | J9h+OdR/6XCqhqpJdJvYtw | backup_databases_1100CC.tar | (64771,655417,33188,1,0,34,0,17029120,4096,33264,1707346812,1707345031,1707431431,0,0,29) |
>>>> +-------------+-----------+-------+---------+----------+--------+--------+--------+---------------------------------------------------------------+------------------------+-----------------------------+-------------------------------------------------------------------------------------------+
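
For anyone wanting to read the lstat column without going through the catalog's decode_lstat function, here is a minimal sketch of a decoder. It assumes each space-separated token is an unsigned number written with 6 bits per character in the usual Base64 alphabet order; that assumption reproduces the decode_lstat tuples shown above, but it is not taken from the Bareos sources. The function names are my own.

```python
# Usual Base64 alphabet order: A-Z, a-z, 0-9, '+', '/'.
BASE64_DIGITS = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def decode_field(token):
    """Decode one lstat token as a big-endian base-64 number (6 bits per char)."""
    value = 0
    for char in token:
        value = value * 64 + BASE64_DIGITS.index(char)
    return value


def decode_lstat(lstat):
    """Decode a whole lstat string into a tuple of integers."""
    return tuple(decode_field(token) for token in lstat.split())


if __name__ == "__main__":
    # lstat value of fileindex=2 in job 94511, copied from the query result above.
    fields = decode_lstat(
        "P0D CgA4 IGk B A i A sADp5 BAA FgAo BlxAt1 BlxASF BlxVYH A A d"
    )
    print(fields)
    print("size in bytes:", fields[7])  # 738212473
```

The eighth field (index 7) is the file size, which is the 738212473 that shows up as "Original" in the restore errors.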
>>>>
>>>>
>>>> Restore of this job (jobid=94511) works just fine:
>>>>
>>>> 23-Feb 13:44 bareos-sd JobId 94559: Releasing device "FileDevice-ReadOnly-0005" (/backup_1/bareos).
>>>> 23-Feb 13:44 bareos-dir JobId 94559: Max configured use duration=82,800 sec. exceeded. Marking Volume "AI-Consolidated-1771" as Used.
>>>> 23-Feb 13:44 bareos-dir JobId 94559: Bareos bareos-dir 23.0.2~pre32.0a0e55739 (31Jan24):
>>>>   Build OS:               Ubuntu 20.04.5 LTS
>>>>   JobId:                  94559
>>>>   Job:                    Restore-Files.2024-02-23_13.43.47_38
>>>>   Restore Client:         "bareos-fd" 23.0.2~pre32.0a0e55739 (31Jan24) Ubuntu 20.04.5 LTS,ubuntu
>>>>   Start time:             23-Feb-2024 13:43:49
>>>>   End time:               23-Feb-2024 13:44:20
>>>>   Elapsed time:           31 secs
>>>>   Files Expected:         33
>>>>   Files Restored:         33
>>>>   Bytes Restored:         9,143,514,060
>>>>   Rate:                   294952.1 KB/s
>>>>   FD Errors:              0
>>>>   FD termination status:  OK
>>>>   SD termination status:  OK
>>>>   Bareos binary info:     Bareos community build (UNSUPPORTED): Get professional support from https://www.bareos.com
>>>>   Job triggered by:       User
>>>>   Termination:            Restore OK
>>>>
>>>> I'll also try restoring its copy from tape once it has been copied.
>>>>
>>>> Best regards,
>>>> Samuel
>>>> On Friday, February 23, 2024 at 1:16:32 PM UTC+1 Sebastian Sura wrote:
>>>>
>>>>> I meant to say File 3 has no stream 1, sorry for the confusion!
>>>>> On 23.02.24 at 13:15, Sebastian Sura wrote:
>>>>>
>>>>> Hi Samuel
>>>>>
>>>>> Thanks for gathering this info. I'll look into the bscan issue as 
>>>>> well, though I think this might be a known issue. 
>>>>> Regardless, regarding your actual issue: the bscan output shows that 
>>>>> File 3 is missing its attributes (it has no stream 3).
>>>>> This confuses the director, as it was never told that file 3 had 
>>>>> started being backed up, and this is why you get the warning
>>>>> message from the director.
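
The same check can be run over a saved `bscan --list-records` listing. This is a minimal sketch that goes by the follow-up correction in this thread (the missing stream is stream 1, the attributes) and treats stream 2 as file data, as described here. The record lines in the sample are invented in the style of the bscan output, and the function name is my own.

```python
import re
from collections import defaultdict

# Picks the three fields of interest out of a "Record:" line of the listing.
RECORD = re.compile(r"FileIndex=(-?\d+)\s+Stream=(\d+)\s+len=(\d+)")

STREAM_ATTRIBUTES = 1  # attribute stream, per the correction in this thread
STREAM_DATA = 2        # file data stream, as used in this thread

# Invented sample: FileIndex 3 has data records but no attribute record.
SAMPLE_LISTING = """
bscan: Record: SessId=1 SessTim=100 FileIndex=2 Stream=1 len=120
bscan: Record: SessId=1 SessTim=100 FileIndex=2 Stream=2 len=65536
bscan: Record: SessId=1 SessTim=100 FileIndex=2 Stream=3 len=16
bscan: Record: SessId=1 SessTim=100 FileIndex=3 Stream=2 len=65536
bscan: Record: SessId=1 SessTim=100 FileIndex=3 Stream=3 len=16
bscan: Record: SessId=1 SessTim=100 FileIndex=4 Stream=1 len=100
bscan: Record: SessId=1 SessTim=100 FileIndex=-5 Stream=94145 len=193
"""


def files_without_attributes(listing):
    """Return FileIndex values that carry data but never an attribute record."""
    streams_seen = defaultdict(set)
    for file_index, stream, _length in RECORD.findall(listing):
        streams_seen[int(file_index)].add(int(stream))
    return sorted(
        index
        for index, streams in streams_seen.items()
        # Negative FileIndex values are label records, not files.
        if index > 0 and STREAM_DATA in streams and STREAM_ATTRIBUTES not in streams
    )


if __name__ == "__main__":
    print(files_without_attributes(SAMPLE_LISTING))  # [3]
```

Any index this reports is a candidate for the "MD5 digest not same File=N as attributes=N-1" warning, since the digest arrives for a file whose attributes were never announced.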
>>>>>
>>>>> The same happens during the restore: the file daemon was never 
>>>>> told that a new file had started (because it never got the attribute 
>>>>> stream),
>>>>> so it basically merged File 2 and File 3 into one file. Even 
>>>>> with this in mind, I think the restore should have caught that and issued 
>>>>> a warning.
>>>>> I'll look into why this did not happen.
>>>>>
>>>>> If you add up all the data records (stream=2) with FileIndex=2 and 
>>>>> FileIndex=3, you will get 738212473 bytes, which is exactly what the 
>>>>> file daemon reported
>>>>> as the size of file 2. Since the log contains the size it expected, I 
>>>>> imagine that you could manually restore file 3 by splitting file 2 into 
>>>>> two.
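
Adding up the data records by hand is tedious on a long listing, so here is a minimal sketch that totals the stream=2 record lengths per FileIndex from `bscan --list-records` output. The sample lines are invented (only the field layout follows the bscan output in this thread), and the function name is my own.

```python
import re

# Picks FileIndex, Stream and len out of each "Record:" line of the listing.
RECORD = re.compile(r"FileIndex=(-?\d+)\s+Stream=(\d+)\s+len=(\d+)")

STREAM_DATA = 2  # file data stream, as used in this thread

# Invented sample listing with two files.
SAMPLE_LISTING = """
bscan: Record: SessId=1 SessTim=100 FileIndex=2 Stream=1 len=120
bscan: Record: SessId=1 SessTim=100 FileIndex=2 Stream=2 len=65536
bscan: Record: SessId=1 SessTim=100 FileIndex=2 Stream=2 len=65536
bscan: Record: SessId=1 SessTim=100 FileIndex=2 Stream=3 len=16
bscan: Record: SessId=1 SessTim=100 FileIndex=3 Stream=2 len=65536
bscan: Record: SessId=1 SessTim=100 FileIndex=3 Stream=2 len=22528
bscan: Record: SessId=1 SessTim=100 FileIndex=3 Stream=3 len=16
"""


def data_bytes_by_file(listing):
    """Return {FileIndex: total length of its data records}."""
    totals = {}
    for file_index, stream, length in RECORD.findall(listing):
        if int(stream) == STREAM_DATA:
            index = int(file_index)
            totals[index] = totals.get(index, 0) + int(length)
    return totals


if __name__ == "__main__":
    totals = data_bytes_by_file(SAMPLE_LISTING)
    print(totals)                   # {2: 131072, 3: 88064}
    print(totals[2] + totals[3])    # 219136
```

Run against the real listing, the combined total for two neighbouring indexes can then be compared with the size the file daemon reported for the first of them.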
>>>>>
>>>>> Can you check which file FileIndex 3 corresponds to? Is it a special 
>>>>> kind of file or just a normal one?
>>>>>
>>>>> I'll try to see if I can reproduce your issue in the copy system test. 
>>>>> Do you do a local copy (i.e. a copy to the same SD) or a remote one?
>>>>>
>>>>> Kind Regards
>>>>>
>>>>> Sebastian Sura
>>>>> On 23.02.24 at 12:56, 'Samuel' via bareos-users wrote:
>>>>>
>>>>> Hi Sebastian,
>>>>>
>>>>> After looking at the logs some more I noticed that all jobs with this 
>>>>> warning have in common that they're using compression (LZ4) and 
>>>>> autoxflate on replication.
>>>>> Perhaps this combination is not entirely fixed yet. I'll try 
>>>>> disabling autoxflate in the coming days.
>>>>>
>>>>> The entire result of bscan is in the attachment. However here's the 
>>>>> last part of it where bscan seems to abort (?):
>>>>> $ sudo -u bareos bscan -b md5_digest_error_copy_only.bsr --list-records TapeDevice2 2>&1 | tee records.txt
>>>>> ...
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=23 Stream=2 len=65536
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=23 Stream=2 len=65536
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=23 Stream=2 len=65536
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=23 Stream=2 len=22528
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=23 Stream=3 len=16
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=24 Stream=1 len=100
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=25 Stream=1 len=83
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=-5 Stream=94145 len=193
>>>>> bscan: stored/bscan.cc:681-0 Could not find SessId=3480 SessTime=1707222936 for EOS record.
>>>>> Records would have been added or updated in the catalog:
>>>>>       0 Media
>>>>>       1 Pool
>>>>>       0 Job
>>>>>       0 File
>>>>>       0 RestoreObject
>>>>> 23-Feb 12:05 bscan JobId 0: Releasing device "TapeDevice2" (/dev/tape/by-id/scsi-35000e111c71ac0bf-nst).
>>>>>
>>>>> I also tried restoring the copy which ends in error.
>>>>>
>>>>> Most files are restored successfully except for one file which happens 
>>>>> to have FileIndex=2 (corresponding to the `attributes=2` in the warning?):
>>>>> Enter SQL query: select path,name from file,path where jobid=94146 and 
>>>>> fileindex=2 and file.pathid=path.pathid;
>>>>> +-----------------------------------+--------------------------+
>>>>> | path                              | name                     |
>>>>> +-----------------------------------+--------------------------+
>>>>> | /var/backup/old/2024-02-07_23:30/ | backup_www_1100CC.tar.gz |
>>>>> +-----------------------------------+--------------------------+
>>>>>
>>>>> A full restore of the original non-copy backup jobs works fine, as do 
>>>>> restores of other copy jobs on tape of filesets that don't use 
>>>>> compression.
>>>>> The original job that was copied (93682) doesn't exist anymore; I 
>>>>> think it was an always-incremental consolidated incremental backup which 
>>>>> got consolidated again the following day.
>>>>>
>>>>> Best regards,
>>>>> Samuel
>>>>> -- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "bareos-users" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit 
>>>>> https://groups.google.com/d/msgid/bareos-users/e766e8c6-5b58-424f-816f-f7c5d7252180n%40googlegroups.com
>>>>>  
>>>>> <https://groups.google.com/d/msgid/bareos-users/e766e8c6-5b58-424f-816f-f7c5d7252180n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
>>>>> -- 
>>>>>  Sebastian Sura                  [email protected]
>>>>>  Bareos GmbH & Co. KG            Phone: +49 221 630693-0
>>>>>  https://www.bareos.com
>>>>>  Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
>>>>>  Komplementär: Bareos Verwaltungs-GmbH
>>>>>  Geschäftsführer: Stephan Dühr, Jörg Steffens, Philipp Storz
