I finally had the chance to restart the storage daemon this morning in 
order to disable autoxflate.
Today's copy jobs that ran afterwards finished without any warning, so I 
think it's probably an issue with autoxflate (on replication?). I can also 
successfully restore those new copies.

Best regards,
Samuel
On Monday, February 26, 2024 at 12:19:50 PM UTC+1 Samuel wrote:

> Yes, that's right. In this case though it's a consolidated incremental 
> which was copied. However, the warnings also occur during migration jobs of 
> consolidated fulls. The warning has never occurred while copying/migrating 
> a non-consolidated incremental.
>
> This is all working just fine except for a couple of jobs which all have 
> the following in common: filedaemon is on another host than director, they 
> use LZ4 compression and autoxflate inflates on replication (though it's 
> also enabled for all the other jobs but doesn't have any effect since they 
> don't use compression).
>
> I use the SQL Query selection type:
> Selection Type = SQL Query
> Selection Pattern = "
>   SELECT DISTINCT Job.JobId, Job.StartTime
>   FROM Job, Pool
>   WHERE Pool.Name IN ('AI-Consolidated', 'AI-Incremental') AND
>         Pool.PoolId = Job.PoolId AND
>         Job.Type = 'B' AND
>         Job.JobStatus IN ('T', 'W') AND
>         Job.Level IN ('I') AND
>         Job.JobBytes > 0 AND
>         Job.JobId NOT IN (
>             SELECT PriorJobId
>             FROM Job AS copy
>             WHERE copy.Type IN ('B', 'C', 'A') AND
>                   copy.JobStatus IN ('T', 'W') AND
>                   copy.PriorJobId != 0
>         )
>   ORDER by Job.StartTime;"
>
> No, there's no concurrent writing going on currently. Before updating to 
> Bareos 23 I did have devices with concurrent writing enabled, but that was 
> before these warnings happened.
>
> Best regards,
> Samuel
>
> On Monday, February 26, 2024 at 11:51:59 AM UTC+1 Sebastian Sura wrote:
>
>> Hi Samuel
>>
>> i just wanted to make sure i completely understand the situation.  You 
>> have an always incremental job with compression
>> turned on (which algorithm do you use ?) and then you have a copy job 
>> that copies the consolidated full job (on the same sd)
>> to tape with a device that auto inflates on write.
>>
>> Do i have that right ?
>>
>> Does the device where the job gets consolidated onto allow concurrent 
>> writes from multiple jobs ?
>>
>> What selection type has your copy job ?
>>
>> Kind Regards
>> Sebastian Sura
>> Am 26.02.24 um 11:32 schrieb 'Samuel' via bareos-users:
>>
>> Hi again. 
>>
>> So, as expected, the warning occurred again when job 94511 was copied 
>> (but also two new warnings for this specific job):
>> Warning: dird/catreq.cc:608 MD5 digest not same File=3 as attributes=2
>> Warning: dird/catreq.cc:608 MD5 digest not same File=18 as attributes=17
>> Warning: dird/catreq.cc:608 MD5 digest not same File=23 as attributes=22
>>
>> The restore of the copy also errored in the same manner as before:
>>  2024-02-26 10:43:50 bareos-fd JobId 95784: Error: findlib/attribs.cc:381 
>> File size of restored file 
>> /tmp/bareos-restores/var/backup/old/2024-02-07_23:30/backup_www_1100CC.tar.gz
>>  
>> not correct. Original 738212473, restored 442105856.
>>  2024-02-26 10:44:10 bareos-fd JobId 95784: Error: findlib/attribs.cc:381 
>> File size of restored file 
>> /tmp/bareos-restores/var/backup/old/2024-02-12_23:30/backup_www_1100CC.tar.gz
>>  
>> not correct. Original 856739265, restored 783548416.
>>  2024-02-26 10:44:17 bareos-fd JobId 95784: Error: findlib/attribs.cc:381 
>> File size of restored file 
>> /tmp/bareos-restores/var/backup/old/2024-02-14_23:30/backup_www_1100CC.tar.gz
>>  
>> not correct. Original 856739984, restored 31064064.
>>
>> For now I'll try disabling autoxflate and see if that solves the issue.
>>
>> Best regards,
>> Samuel
>>
>>
>> On Friday, February 23, 2024 at 2:02:28 PM UTC+1 Samuel wrote:
>>
>>> Thanks for looking into it!
>>>
>>> Hm, there seems to be no file with fileindex=3 in the database for the 
>>> copy:
>>> Enter SQL query: select * from file where jobid=94146 and fileindex=3;
>>> No results to list. 
>>>
>>> All files that are being backed up are just regular files, nothing 
>>> special.
>>>
>>> The copy runs locally on the director's host from its SD to the same SD.
>>>
>>> Here's file 2 and 3 of today's consolidated incremental backup which 
>>> will be copied tomorrow and most likely lead to the same warning again as 
>>> it has for the last few days (for this specific fileset it always warns 
>>> about `File=3 attributes=2`):
>>>
>>> Enter SQL query: select *, decode_lstat(lstat) from file where 
>>> jobid=94511 and fileindex=2;
>>>
>>> +-------------+-----------+-------+---------+----------+--------+--------+--------+----------------------------------------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------+
>>> | fileid      | fileindex | jobid | pathid  | deltaseq | markid | fhinfo 
>>> | fhnode | lstat                                                          | 
>>> md5                    | name                     | decode_lstat           
>>>                                                                       |
>>>
>>> +-------------+-----------+-------+---------+----------+--------+--------+--------+----------------------------------------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------+
>>> | 484,575,603 |         2 | 94511 | 910,901 |        0 |      0 | 0     
>>>  | 0      | P0D CgA4 IGk B A i A sADp5 BAA FgAo BlxAt1 BlxASF BlxVYH A A d 
>>> | D/VouduY5TF4KFacjRE7Hw | backup_www_1100CC.tar.gz | 
>>> (64771,655416,33188,1,0,34,0,738212473,4096,1441832,1707346805,1707345029,1707431431,0,0,29)
>>>  
>>> |
>>>
>>> +-------------+-----------+-------+---------+----------+--------+--------+--------+----------------------------------------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------+
>>>
>>> Enter SQL query: select *, decode_lstat(lstat) from file where 
>>> jobid=94511 and fileindex=3;
>>>
>>> +-------------+-----------+-------+---------+----------+--------+--------+--------+---------------------------------------------------------------+------------------------+-----------------------------+-------------------------------------------------------------------------------------------+
>>> | fileid      | fileindex | jobid | pathid  | deltaseq | markid | fhinfo 
>>> | fhnode | lstat                                                         | 
>>> md5                    | name                        | decode_lstat         
>>>                                                                      |
>>>
>>> +-------------+-----------+-------+---------+----------+--------+--------+--------+---------------------------------------------------------------+------------------------+-----------------------------+-------------------------------------------------------------------------------------------+
>>> | 484,575,604 |         3 | 94511 | 910,901 |        0 |      0 | 0     
>>>  | 0      | P0D CgA5 IGk B A i A BA9gA BAA IHw BlxAt8 BlxASH BlxVYH A A d | 
>>> J9h+OdR/6XCqhqpJdJvYtw | backup_databases_1100CC.tar | 
>>> (64771,655417,33188,1,0,34,0,17029120,4096,33264,1707346812,1707345031,1707431431,0,0,29)
>>>  
>>> |
>>>
>>> +-------------+-----------+-------+---------+----------+--------+--------+--------+---------------------------------------------------------------+------------------------+-----------------------------+-------------------------------------------------------------------------------------------+
>>>
>>>
>>> Restore of this job (jobid=94511) works just fine:
>>>
>>> 23-Feb 13:44 bareos-sd JobId 94559: Releasing device 
>>> "FileDevice-ReadOnly-0005" (/backup_1/bareos).
>>> 23-Feb 13:44 bareos-dir JobId 94559: Max configured use duration=82,800 
>>> sec. exceeded. Marking Volume "AI-Consolidated-1771" as Used.
>>> 23-Feb 13:44 bareos-dir JobId 94559: Bareos bareos-dir 
>>> 23.0.2~pre32.0a0e55739 (31Jan24):
>>>   Build OS:               Ubuntu 20.04.5 LTS
>>>   JobId:                  94559
>>>   Job:                    Restore-Files.2024-02-23_13.43.47_38
>>>   Restore Client:         "bareos-fd" 23.0.2~pre32.0a0e55739 (31Jan24) 
>>> Ubuntu 20.04.5 LTS,ubuntu
>>>   Start time:             23-Feb-2024 13:43:49
>>>   End time:               23-Feb-2024 13:44:20
>>>   Elapsed time:           31 secs
>>>   Files Expected:         33
>>>   Files Restored:         33
>>>   Bytes Restored:         9,143,514,060
>>>   Rate:                   294952.1 KB/s
>>>   FD Errors:              0
>>>   FD termination status:  OK
>>>   SD termination status:  OK
>>>   Bareos binary info:     Bareos community build (UNSUPPORTED): Get 
>>> professional support from https://www.bareos.com
>>>   Job triggered by:       User
>>>   Termination:            Restore OK
>>>
>>> I'll also try restoring its copy from tape once it has been copied.
>>>
>>> Best regards,
>>> Samuel
>>> On Friday, February 23, 2024 at 1:16:32 PM UTC+1 Sebastian Sura wrote:
>>>
>>>> I meant to say File 3 has no stream 1, sorry for the confusion!
>>>> Am 23.02.24 um 13:15 schrieb Sebastian Sura:
>>>>
>>>> Hi Samuel
>>>>
>>>> thanks for the gathering this info.  Ill look into the bscan issue as 
>>>> well though i think this might be a known issue. 
>>>> Regardless, regarding your actual issue: The bscan output shows that 
>>>> File 3 is missing its attributes (it has no stream 3).
>>>> This confuses the director as it never got told that file 3 now started 
>>>> getting backed up and this is why you get the warning
>>>> message from the director.
>>>>
>>>> The same is happening during the restore: the filedaemon never got told 
>>>> that now a new file was started (because it never got the attribute stream)
>>>> so it basically merged both File 2 and File 3 into one file.  Even with 
>>>> this in mind, i think the restore should have caught that and issues a 
>>>> warning.
>>>> Ill look into why this did not happen.
>>>>
>>>> If you add up all the data records (stream=2) with fileid=2 and 
>>>> fileid=3, you will get 738212473 bytes, which is exactly what the 
>>>> filedaemon reported
>>>> as size for file 2.  Since the log contains the size it expected i 
>>>> imagine that you could manually restore file 3 by splitting file 2 into 
>>>> two.
>>>>
>>>> Can you check which file fileid 3 corresponds to ? Is it a special kind 
>>>> of file or just a normal one ?
>>>>
>>>> Ill try to see if i can reproduce your issue in the copy system test.  
>>>> Do you do a local copy (so copy to the same sd) or a remote one?
>>>>
>>>> Kind Regards
>>>>
>>>> Sebastian Sura
>>>> Am 23.02.24 um 12:56 schrieb 'Samuel' via bareos-users:
>>>>
>>>> Hi Sebastian,
>>>>
>>>> After looking at the logs some more I noticed that all jobs with this 
>>>> warning have in common that they're using compression (LZ4) and autoxflate 
>>>> on replication.
>>>> Perhaps this combination is still not entirely fixed yet. I'll try 
>>>> disabling autoxflate in the coming days.
>>>>
>>>> The entire result of bscan is in the attachment. However here's the 
>>>> last part of it where bscan seems to abort (?):
>>>> $ sudo -u bareos bscan -b md5_digest_error_copy_only.bsr --list-records 
>>>> TapeDevice2 2>&1 | tee records.txt
>>>> ...
>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 
>>>> FileIndex=23 Stream=2 len=65536
>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 
>>>> FileIndex=23 Stream=2 len=65536
>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 
>>>> FileIndex=23 Stream=2 len=65536
>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 
>>>> FileIndex=23 Stream=2 len=22528
>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 
>>>> FileIndex=23 Stream=3 len=16
>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 
>>>> FileIndex=24 Stream=1 len=100
>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 
>>>> FileIndex=25 Stream=1 len=83
>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 
>>>> FileIndex=-5 Stream=94145 len=193
>>>> bscan: stored/bscan.cc:681-0 Could not find SessId=3480 
>>>> SessTime=1707222936 for EOS record.
>>>> Records would have been added or updated in the catalog:
>>>>       0 Media
>>>>       1 Pool
>>>>       0 Job
>>>>       0 File
>>>>       0 RestoreObject
>>>> 23-Feb 12:05 bscan JobId 0: Releasing device "TapeDevice2" 
>>>> (/dev/tape/by-id/scsi-35000e111c71ac0bf-nst).
>>>>
>>>> I also tried restoring the copy which ends in error.
>>>>
>>>> Most files are restored successfully except for one file which happens 
>>>> to have FileIndex=2 (corresponding to the `attributes=2` in the warning?):
>>>> Enter SQL query: select path,name from file,path where jobid=94146 and 
>>>> fileindex=2 and file.pathid=path.pathid;
>>>> +-----------------------------------+--------------------------+
>>>> | path                              | name                     |
>>>> +-----------------------------------+--------------------------+
>>>> | /var/backup/old/2024-02-07_23:30/ | backup_www_1100CC.tar.gz |
>>>> +-----------------------------------+--------------------------+
>>>>
>>>> A full restore of the original non-copy backup jobs works fine, as do 
>>>> restores of other copy jobs on tape of filesets that don't use compression.
>>>> The original job that was copied (93682) doesn't exist anymore; I think 
>>>> it was an always-incremental consolidated incremental backup which got 
>>>> consolidated again the following day.
>>>>
>>>> Best regards,
>>>> Samuel
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "bareos-users" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/bareos-users/e766e8c6-5b58-424f-816f-f7c5d7252180n%40googlegroups.com
>>>>  
>>>> <https://groups.google.com/d/msgid/bareos-users/e766e8c6-5b58-424f-816f-f7c5d7252180n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>>> -- 
>>>>  Sebastian Sura                  [email protected]
>>>>  Bareos GmbH & Co. KG            Phone: +49 221 630693-0
>>>>  https://www.bareos.com
>>>>  Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
>>>>  Komplementär: Bareos Verwaltungs-GmbH
>>>>  Geschäftsführer: Stephan Dühr, Jörg Steffens, Philipp Storz
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "bareos-users" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>>
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/bareos-users/111cecfe-d3e1-4eb9-8a0e-f053a2777d0f%40bareos.com
>>>>  
>>>> <https://groups.google.com/d/msgid/bareos-users/111cecfe-d3e1-4eb9-8a0e-f053a2777d0f%40bareos.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>>> -- 
>>>>  Sebastian Sura                  [email protected]
>>>>  Bareos GmbH & Co. KG            Phone: +49 221 630693-0
>>>>  https://www.bareos.com
>>>>  Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
>>>>  Komplementär: Bareos Verwaltungs-GmbH
>>>>  Geschäftsführer: Stephan Dühr, Jörg Steffens, Philipp Storz
>>>>
>>>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "bareos-users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>>
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/bareos-users/3823a22d-b49e-4892-8b45-6c932ce7e699n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/bareos-users/3823a22d-b49e-4892-8b45-6c932ce7e699n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>> -- 
>>  Sebastian Sura                  [email protected]
>>  Bareos GmbH & Co. KG            Phone: +49 221 630693-0
>>  https://www.bareos.com
>>  Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
>>  Komplementär: Bareos Verwaltungs-GmbH
>>  Geschäftsführer: Stephan Dühr, Jörg Steffens, Philipp Storz
>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"bareos-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/bareos-users/67455f46-6182-4fb6-9be1-e8969ae0fb3dn%40googlegroups.com.

Reply via email to