Hi Sebastian

That's great, thank you! I'll give it a try when I have the opportunity to update the director/SD again.

Best regards
Samuel

On Monday, March 25, 2024 at 3:46:42 PM UTC+1 Sebastian Sura wrote:

> Hi Samuel,
>
> I think I managed to reproduce your problem. I created a fix here if you
> want to check it out: https://github.com/bareos/bareos/pull/1745
>
> Kind Regards
>
> Sebastian Sura
> On 04.03.24 at 11:43, 'Samuel' via bareos-users wrote:
>
> I finally had the chance to restart the storage daemon this morning in
> order to disable autoxflate.
> Today's copy jobs that ran afterwards finished without any warning, so I
> think it's probably an issue with autoxflate (on replication?). I can also
> successfully restore those new copies.
>
> Best regards,
> Samuel
> On Monday, February 26, 2024 at 12:19:50 PM UTC+1 Samuel wrote:
>
>> Yes, that's right. In this case though it's a consolidated incremental
>> which was copied. However, the warnings also occur during migration jobs of
>> consolidated fulls. The warning has never occurred while copying/migrating
>> a non-consolidated incremental.
>>
>> This is all working just fine except for a couple of jobs which all have
>> the following in common: filedaemon is on another host than director, they
>> use LZ4 compression and autoxflate inflates on replication (though it's
>> also enabled for all the other jobs but doesn't have any effect since they
>> don't use compression).
>>
>> I use the SQL Query selection type:
>> Selection Type = SQL Query
>> Selection Pattern = "
>>   SELECT DISTINCT Job.JobId, Job.StartTime
>>   FROM Job, Pool
>>   WHERE Pool.Name IN ('AI-Consolidated', 'AI-Incremental') AND
>>     Pool.PoolId = Job.PoolId AND
>>     Job.Type = 'B' AND
>>     Job.JobStatus IN ('T', 'W') AND
>>     Job.Level IN ('I') AND
>>     Job.JobBytes > 0 AND
>>     Job.JobId NOT IN (
>>       SELECT PriorJobId
>>       FROM Job AS copy
>>       WHERE copy.Type IN ('B', 'C', 'A') AND
>>         copy.JobStatus IN ('T', 'W') AND
>>         copy.PriorJobId != 0
>>     )
>>   ORDER BY Job.StartTime;"
>>
>> No, there's no concurrent writing going on currently.
>> Before updating to Bareos 23 I did have devices with concurrent writing
>> enabled, but that was before these warnings happened.
>>
>> Best regards,
>> Samuel
>>
>> On Monday, February 26, 2024 at 11:51:59 AM UTC+1 Sebastian Sura wrote:
>>
>>> Hi Samuel
>>>
>>> I just wanted to make sure I completely understand the situation. You
>>> have an always incremental job with compression turned on (which
>>> algorithm do you use?) and then you have a copy job that copies the
>>> consolidated full job (on the same SD) to tape with a device that
>>> auto inflates on write.
>>>
>>> Do I have that right?
>>>
>>> Does the device where the job gets consolidated onto allow concurrent
>>> writes from multiple jobs?
>>>
>>> What selection type does your copy job have?
>>>
>>> Kind Regards
>>> Sebastian Sura
>>> On 26.02.24 at 11:32, 'Samuel' via bareos-users wrote:
>>>
>>> Hi again.
>>>
>>> So, as expected, the warning occurred again when job 94511 was copied
>>> (but also two new warnings for this specific job):
>>> Warning: dird/catreq.cc:608 MD5 digest not same File=3 as attributes=2
>>> Warning: dird/catreq.cc:608 MD5 digest not same File=18 as attributes=17
>>> Warning: dird/catreq.cc:608 MD5 digest not same File=23 as attributes=22
>>>
>>> The restore of the copy also errored in the same manner as before:
>>> 2024-02-26 10:43:50 bareos-fd JobId 95784: Error: findlib/attribs.cc:381 File size of restored file /tmp/bareos-restores/var/backup/old/2024-02-07_23:30/backup_www_1100CC.tar.gz not correct. Original 738212473, restored 442105856.
>>> 2024-02-26 10:44:10 bareos-fd JobId 95784: Error: findlib/attribs.cc:381 File size of restored file /tmp/bareos-restores/var/backup/old/2024-02-12_23:30/backup_www_1100CC.tar.gz not correct. Original 856739265, restored 783548416.
>>> 2024-02-26 10:44:17 bareos-fd JobId 95784: Error: findlib/attribs.cc:381 File size of restored file /tmp/bareos-restores/var/backup/old/2024-02-14_23:30/backup_www_1100CC.tar.gz not correct. Original 856739984, restored 31064064.
>>>
>>> For now I'll try disabling autoxflate and see if that solves the issue.
>>>
>>> Best regards,
>>> Samuel
>>>
>>>
>>> On Friday, February 23, 2024 at 2:02:28 PM UTC+1 Samuel wrote:
>>>
>>>> Thanks for looking into it!
>>>>
>>>> Hm, there seems to be no file with fileindex=3 in the database for the
>>>> copy:
>>>> Enter SQL query: select * from file where jobid=94146 and fileindex=3;
>>>> No results to list.
>>>>
>>>> All files that are being backed up are just regular files, nothing
>>>> special.
>>>>
>>>> The copy runs locally on the director's host from its SD to the same SD.
>>>>
>>>> Here's file 2 and 3 of today's consolidated incremental backup which
>>>> will be copied tomorrow and most likely lead to the same warning again as
>>>> it has for the last few days (for this specific fileset it always warns
>>>> about `File=3 attributes=2`):
>>>>
>>>> Enter SQL query: select *, decode_lstat(lstat) from file where jobid=94511 and fileindex=2;
>>>>
>>>> fileid:       484,575,603
>>>> fileindex:    2
>>>> jobid:        94511
>>>> pathid:       910,901
>>>> deltaseq:     0
>>>> markid:       0
>>>> fhinfo:       0
>>>> fhnode:       0
>>>> lstat:        P0D CgA4 IGk B A i A sADp5 BAA FgAo BlxAt1 BlxASF BlxVYH A A d
>>>> md5:          D/VouduY5TF4KFacjRE7Hw
>>>> name:         backup_www_1100CC.tar.gz
>>>> decode_lstat: (64771,655416,33188,1,0,34,0,738212473,4096,1441832,1707346805,1707345029,1707431431,0,0,29)
>>>>
>>>> Enter SQL query: select *, decode_lstat(lstat) from file where jobid=94511 and fileindex=3;
>>>>
>>>> fileid:       484,575,604
>>>> fileindex:    3
>>>> jobid:        94511
>>>> pathid:       910,901
>>>> deltaseq:     0
>>>> markid:       0
>>>> fhinfo:       0
>>>> fhnode:       0
>>>> lstat:        P0D CgA5 IGk B A i A BA9gA BAA IHw BlxAt8 BlxASH BlxVYH A A d
>>>> md5:          J9h+OdR/6XCqhqpJdJvYtw
>>>> name:         backup_databases_1100CC.tar
>>>> decode_lstat: (64771,655417,33188,1,0,34,0,17029120,4096,33264,1707346812,1707345031,1707431431,0,0,29)
>>>>
>>>>
>>>> Restore of this job (jobid=94511) works just fine:
>>>>
>>>> 23-Feb 13:44 bareos-sd JobId 94559: Releasing device "FileDevice-ReadOnly-0005" (/backup_1/bareos).
>>>> 23-Feb 13:44 bareos-dir JobId 94559: Max configured use duration=82,800 sec. exceeded. Marking Volume "AI-Consolidated-1771" as Used.
>>>> 23-Feb 13:44 bareos-dir JobId 94559: Bareos bareos-dir 23.0.2~pre32.0a0e55739 (31Jan24):
>>>>   Build OS:               Ubuntu 20.04.5 LTS
>>>>   JobId:                  94559
>>>>   Job:                    Restore-Files.2024-02-23_13.43.47_38
>>>>   Restore Client:         "bareos-fd" 23.0.2~pre32.0a0e55739 (31Jan24) Ubuntu 20.04.5 LTS,ubuntu
>>>>   Start time:             23-Feb-2024 13:43:49
>>>>   End time:               23-Feb-2024 13:44:20
>>>>   Elapsed time:           31 secs
>>>>   Files Expected:         33
>>>>   Files Restored:         33
>>>>   Bytes Restored:         9,143,514,060
>>>>   Rate:                   294952.1 KB/s
>>>>   FD Errors:              0
>>>>   FD termination status:  OK
>>>>   SD termination status:  OK
>>>>   Bareos binary info:     Bareos community build (UNSUPPORTED): Get professional support from https://www.bareos.com
>>>>   Job triggered by:       User
>>>>   Termination:            Restore OK
>>>>
>>>> I'll also try restoring its copy from tape once it has been copied.
>>>>
>>>> Best regards,
>>>> Samuel
>>>> On Friday, February 23, 2024 at 1:16:32 PM UTC+1 Sebastian Sura wrote:
>>>>
>>>>> I meant to say File 3 has no stream 1, sorry for the confusion!
>>>>> On 23.02.24 at 13:15, Sebastian Sura wrote:
>>>>>
>>>>> Hi Samuel
>>>>>
>>>>> Thanks for gathering this info. I'll look into the bscan issue as
>>>>> well, though I think this might be a known issue.
>>>>> Regardless, regarding your actual issue: The bscan output shows that
>>>>> File 3 is missing its attributes (it has no stream 3).
>>>>> This confuses the director, as it was never told that file 3 had
>>>>> started being backed up, and this is why you get the warning
>>>>> message from the director.
>>>>>
>>>>> The same is happening during the restore: the filedaemon was never
>>>>> told that a new file had started (because it never got the attribute
>>>>> stream), so it basically merged both File 2 and File 3 into one file.
>>>>> Even with this in mind, I think the restore should have caught that
>>>>> and issued a warning.
>>>>> I'll look into why this did not happen.
>>>>>
>>>>> If you add up all the data records (stream=2) with fileid=2 and
>>>>> fileid=3, you will get 738212473 bytes, which is exactly what the
>>>>> filedaemon reported as size for file 2. Since the log contains the
>>>>> size it expected, I imagine that you could manually restore file 3
>>>>> by splitting file 2 into two.
>>>>>
>>>>> Can you check which file fileid 3 corresponds to? Is it a special
>>>>> kind of file or just a normal one?
>>>>>
>>>>> I'll try to see if I can reproduce your issue in the copy system test.
>>>>> Do you do a local copy (so copy to the same SD) or a remote one?
>>>>>
>>>>> Kind Regards
>>>>>
>>>>> Sebastian Sura
>>>>> On 23.02.24 at 12:56, 'Samuel' via bareos-users wrote:
>>>>>
>>>>> Hi Sebastian,
>>>>>
>>>>> After looking at the logs some more I noticed that all jobs with this
>>>>> warning have in common that they're using compression (LZ4) and
>>>>> autoxflate on replication.
>>>>> Perhaps this combination is still not entirely fixed yet. I'll try
>>>>> disabling autoxflate in the coming days.
>>>>>
>>>>> The entire result of bscan is in the attachment. However here's the
>>>>> last part of it where bscan seems to abort (?):
>>>>> $ sudo -u bareos bscan -b md5_digest_error_copy_only.bsr --list-records TapeDevice2 2>&1 | tee records.txt
>>>>> ...
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=23 Stream=2 len=65536
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=23 Stream=2 len=65536
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=23 Stream=2 len=65536
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=23 Stream=2 len=22528
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=23 Stream=3 len=16
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=24 Stream=1 len=100
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=25 Stream=1 len=83
>>>>> bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=-5 Stream=94145 len=193
>>>>> bscan: stored/bscan.cc:681-0 Could not find SessId=3480 SessTime=1707222936 for EOS record.
>>>>> Records would have been added or updated in the catalog:
>>>>> 0 Media
>>>>> 1 Pool
>>>>> 0 Job
>>>>> 0 File
>>>>> 0 RestoreObject
>>>>> 23-Feb 12:05 bscan JobId 0: Releasing device "TapeDevice2" (/dev/tape/by-id/scsi-35000e111c71ac0bf-nst).
>>>>>
>>>>> I also tried restoring the copy which ends in error.
>>>>>
>>>>> Most files are restored successfully except for one file which happens
>>>>> to have FileIndex=2 (corresponding to the `attributes=2` in the warning?):
>>>>> Enter SQL query: select path,name from file,path where jobid=94146 and fileindex=2 and file.pathid=path.pathid;
>>>>> +-----------------------------------+--------------------------+
>>>>> | path                              | name                     |
>>>>> +-----------------------------------+--------------------------+
>>>>> | /var/backup/old/2024-02-07_23:30/ | backup_www_1100CC.tar.gz |
>>>>> +-----------------------------------+--------------------------+
>>>>>
>>>>> A full restore of the original non-copy backup jobs works fine, as do
>>>>> restores of other copy jobs on tape of filesets that don't use
>>>>> compression.
>>>>> The original job that was copied (93682) doesn't exist anymore; I
>>>>> think it was an always-incremental consolidated incremental backup which
>>>>> got consolidated again the following day.
>>>>>
>>>>> Best regards,
>>>>> Samuel
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "bareos-users" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/bareos-users/e766e8c6-5b58-424f-816f-f7c5d7252180n%40googlegroups.com
>>>>>
>>>>> --
>>>>> Sebastian Sura [email protected]
>>>>> Bareos GmbH & Co. KG Phone: +49 221 630693-0
>>>>> https://www.bareos.com
>>>>> Registered office: Köln | Register court: Amtsgericht Köln, HRA 29646
>>>>> General partner: Bareos Verwaltungs-GmbH
>>>>> Managing directors: Stephan Dühr, Jörg Steffens, Philipp Storz
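
For anyone hitting the same warning, the check Sebastian describes in this thread (a file index that has data records but no attribute record) can be scripted against saved `bscan --list-records` output. The following is only a rough sketch, not part of Bareos: the function names are made up for illustration, the regular expression is modelled on the record lines quoted in this thread, and the stream numbers (1 = attributes, 2 = data) are taken from Sebastian's messages. It assumes the listing covers a single session (as with the `.bsr` file used here) and that data arrives as stream 2, which may not hold for volumes that still contain compressed data streams.

```python
import re
from collections import defaultdict

# Modelled on the record lines quoted in this thread, e.g.
# "bscan: stored/bscan.cc:494-0 Record: SessId=3480 SessTim=1707222936 FileIndex=23 Stream=2 len=65536"
RECORD_RE = re.compile(
    r"Record: SessId=(\d+) SessTim=(\d+) FileIndex=(-?\d+) Stream=(-?\d+) len=(\d+)"
)

ATTRIBUTE_STREAM = 1  # per the thread: the affected file "has no stream 1"
DATA_STREAM = 2       # per the thread: "data records (stream=2)"


def summarize_records(lines):
    """Collect the streams seen and the data bytes per FileIndex.

    Assumption: all lines belong to one session, so FileIndex alone
    identifies a file.
    """
    files = defaultdict(lambda: {"streams": set(), "data_bytes": 0})
    for line in lines:
        match = RECORD_RE.search(line)
        if match is None:
            continue  # not a record line (log messages, summaries, ...)
        file_index = int(match.group(3))
        stream = int(match.group(4))
        length = int(match.group(5))
        if file_index <= 0:
            # Non-positive indexes are skipped as label records rather than
            # files (the thread shows FileIndex=-5 next to the EOS message).
            continue
        entry = files[file_index]
        entry["streams"].add(stream)
        if stream == DATA_STREAM:
            entry["data_bytes"] += length
    return dict(files)


def missing_attributes(summary):
    """Return the sorted FileIndex values that never had an attribute record."""
    return sorted(
        index
        for index, entry in summary.items()
        if ATTRIBUTE_STREAM not in entry["streams"]
    )
```

With the output saved as in the `tee records.txt` command above, something like `missing_attributes(summarize_records(open("records.txt")))` would list the file indexes to compare against the `File=... as attributes=...` warnings, and `data_bytes` gives the per-file sum that Sebastian suggests adding up.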
