D11239: debugcommands: introduce a debug command to fix repos affected by issue6528

Raphaël Gomès Mon, 02 Aug 2021 06:37:37 -0700

Alphare created this revision.
Herald added a reviewer: hg-reviewers.
Herald added a subscriber: mercurial-patches.


REVISION SUMMARY
  This command is quite basic and slow, it will loop over the entirety of the
  filelogs in the repository and check each revision for corruption, then fixes
  the affected filelogs. It takes under 25 minutes for Mozilla-Central on my
  not-top-of-the-line laptop, using the `--to-report` and `--from-report` 
options
  will make this pretty tolerable to use, I think.
  
  This change also introduces a test for the fix.

REPOSITORY
  rHG Mercurial

BRANCH
  stable

REVISION DETAIL
  https://phab.mercurial-scm.org/D11239

AFFECTED FILES
  mercurial/debugcommands.py
  mercurial/revlogutils/rewrite.py
  tests/bundles/issue6528.tar
  tests/test-completion.t
  tests/test-help.t
  tests/test-issue6528.t

CHANGE DETAILS

diff --git a/tests/test-issue6528.t b/tests/test-issue6528.t
--- a/tests/test-issue6528.t
+++ b/tests/test-issue6528.t
@@ -166,3 +166,135 @@
      date:        Thu Jan 01 00:00:00 1970 +0000
      summary:     c_base_c - create a.txt
   
+
+Test the command that fixes the issue
+-------------------------------------
+
+Restore a broken repository
+It's a tarball because unbundle might magically fix the issue later
+
+  $ cd ..
+  $ mkdir repo-to-fix
+  $ cd repo-to-fix
+  $ tar xf $TESTDIR/bundles/issue6528.tar
+
+Check that the issue is present
+  $ hg st
+  M b.txt
+
+Dry-run the fix
+  $ hg debug-fix-issue6528 --dry-run
+  found corrupted revision 1 for filelog 'data/b.txt.i'
+  $ hg st
+  M b.txt
+  $ hg debugrevlogindex b.txt
+     rev linkrev nodeid       p1           p2
+       0       2 05b806ebe5ea 000000000000 000000000000
+       1       3 a58b36ad6b65 05b806ebe5ea 000000000000
+
+Run the fix
+  $ hg debug-fix-issue6528
+  found corrupted revision 1 for filelog 'data/b.txt.i'
+
+Check that the fix worked and that running it twice does nothing
+  $ hg st
+  $ hg debug-fix-issue6528
+  No corrupted revisions were found
+  $ hg st
+  $ hg debugrevlogindex b.txt
+     rev linkrev nodeid       p1           p2
+       0       2 05b806ebe5ea 000000000000 000000000000
+       1       3 a58b36ad6b65 000000000000 05b806ebe5ea
+
+Try the using the report options
+--------------------------------
+
+  $ cd ..
+  $ mkdir repo-to-fix-report
+  $ cd repo-to-fix
+  $ tar xf $TESTDIR/bundles/issue6528.tar
+
+  $ hg debug-fix-issue6528 --to-report $TESTTMP/report.txt
+  found corrupted revision 1 for filelog 'data/b.txt.i'
+  $ cat $TESTTMP/report.txt
+  data/b.txt.i a58b36ad6b6545195952793099613c2116f3563b
+
+  $ hg debug-fix-issue6528 --from-report $TESTTMP/report.txt --dry-run
+  Loading report file '$TESTTMP/report.txt'
+  found corrupted revision 1 for filelog 'data/b.txt.i'
+  $ hg st
+  M b.txt
+  $ hg debugrevlogindex b.txt
+     rev linkrev nodeid       p1           p2
+       0       2 05b806ebe5ea 000000000000 000000000000
+       1       3 a58b36ad6b65 05b806ebe5ea 000000000000
+
+  $ hg debug-fix-issue6528 --from-report $TESTTMP/report.txt
+  Loading report file '$TESTTMP/report.txt'
+  found corrupted revision 1 for filelog 'data/b.txt.i'
+  $ hg st
+  $ hg debugrevlogindex b.txt
+     rev linkrev nodeid       p1           p2
+       0       2 05b806ebe5ea 000000000000 000000000000
+       1       3 a58b36ad6b65 000000000000 05b806ebe5ea
+
+Check that the revision is not "fixed" again
+
+  $ hg debug-fix-issue6528 --from-report $TESTTMP/report.txt
+  Loading report file '$TESTTMP/report.txt'
+  Revision a58b36ad6b6545195952793099613c2116f3563b of file 'data/b.txt.i' is 
not corrupted
+  No corrupted revisions were found for 'data/b.txt.i'
+  $ hg st
+  $ hg debugrevlogindex b.txt
+     rev linkrev nodeid       p1           p2
+       0       2 05b806ebe5ea 000000000000 000000000000
+       1       3 a58b36ad6b65 000000000000 05b806ebe5ea
+
+Try it with a non-inline revlog
+-------------------------------
+
+  $ cd ..
+  $ mkdir $TESTTMP/ext
+  $ cat << EOF > $TESTTMP/ext/small_inline.py
+  > from mercurial import revlog
+  > revlog._maxinline = 8
+  > EOF
+
+  $ cat << EOF >> $HGRCPATH
+  > [extensions]
+  > small_inline=$TESTTMP/ext/small_inline.py
+  > EOF
+
+  $ mkdir repo-to-fix-not-inline
+  $ cd repo-to-fix-not-inline
+  $ tar xf $TESTDIR/bundles/issue6528.tar
+  $ echo b >> b.txt
+  $ hg commit -qm "inline -> separate"
+  $ find .hg -name *b.txt.d
+  .hg/store/data/b.txt.d
+
+Status is correct, but the problem is still there, in the earlier revision
+  $ hg st
+  $ hg up 3
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ hg st
+  M b.txt
+  $ hg debugrevlogindex b.txt
+     rev linkrev nodeid       p1           p2
+       0       2 05b806ebe5ea 000000000000 000000000000
+       1       3 a58b36ad6b65 05b806ebe5ea 000000000000
+       2       4 468d0263ae92 a58b36ad6b65 000000000000
+
+Run the fix on the non-inline revlog
+  $ hg debug-fix-issue6528
+  found corrupted revision 1 for filelog 'data/b.txt.i'
+
+Check that it worked
+  $ hg debugrevlogindex b.txt
+     rev linkrev nodeid       p1           p2
+       0       2 05b806ebe5ea 000000000000 000000000000
+       1       3 a58b36ad6b65 000000000000 05b806ebe5ea
+       2       4 468d0263ae92 a58b36ad6b65 000000000000
+  $ hg debug-fix-issue6528
+  No corrupted revisions were found
+  $ hg st
diff --git a/tests/test-help.t b/tests/test-help.t
--- a/tests/test-help.t
+++ b/tests/test-help.t
@@ -975,6 +975,9 @@
   $ hg help debug
   debug commands (internal and unsupported):
   
+   debug-fix-issue6528
+                 find corrupted revisions and fix them. See issue6528 for more
+                 details.
    debugancestor
                  find the ancestor revision of two revisions in a given index
    debugantivirusrunning
diff --git a/tests/test-completion.t b/tests/test-completion.t
--- a/tests/test-completion.t
+++ b/tests/test-completion.t
@@ -74,6 +74,7 @@
 
 Show debug commands if there are no other candidates
   $ hg debugcomplete debug
+  debug-fix-issue6528
   debugancestor
   debugantivirusrunning
   debugapplystreamclonebundle
@@ -266,6 +267,7 @@
   config: untrusted, exp-all-known, edit, local, source, shared, non-shared, 
global, template
   continue: dry-run
   copy: forget, after, at-rev, force, include, exclude, dry-run
+  debug-fix-issue6528: to-report, from-report, dry-run
   debugancestor: 
   debugantivirusrunning: 
   debugapplystreamclonebundle: 
diff --git a/tests/bundles/issue6528.tar b/tests/bundles/issue6528.tar
new file mode 100644
index 
0000000000000000000000000000000000000000..f41b94a813beeb987281c9ff72f6bbe53dad0618
GIT binary patch
literal 112640
zc%1E>50DhqoyWW9|BOIWBCP0LXhLza%I^06*#%KR1`b_zVFfQ7=-NH=<`2wF58XZM
z!d)ez<}~Hi36+G96!G8rC+8)(sAw);<wTQ<5&tBfdj7xCoPTmkB~%p4O3vb5chAnw
z%<eEVyEBdJ{ys%-cfaY^zc=sK-`DToJUY9GmcwAeqG;~fu*j6|M<7^%#essvaRR}x
zEW=3zEw(;s1d-PbS)*w}PUn)chN@my>+>l}XIE!3(c)*I0so5<Bboje84>*N1h)Sf
zCYqG936xe7ovEhVFdZxjLUsR#S<C+n!*jO(`7qBAG}H8yjKehdKS7*4NyqY;%x=2)
z2+)Q~=QK2gvId>kQ&}94M6{ZY(HhFBbXwl6=8e;xY@jLF`j5)dBx>1Ppx*MIVHuIP
z^v{Vr2j#yLl=@#&q8(W|gY=FewwX3+mj6t+!vBIKh*17JK@7#^d>Tsg=}M*k@oT@?
z@~^Fbp|boJL{<X*JHd&&?@pTg@QRIJG57NJ^t0yvj;Akp{dA@aaE4O-8*)PDT41B5
z{zbk*|GW_9LH|y`(=r>4a!M>JGqS8Of-EV#%<^oH*u(HbkBAhxJC4G<5<1;U1zNxu
z{dWwd^puir=7Zk2{zaC9_rD{6{--x%^{*ps2*sLdqjCKUVTlF(I|1l_dQ+<ZWWp=~
z;#oC@^k((|)Y3mYru{E+4D|mw!btnSl9B#^N=EtzDuMojsY$8+(U6LUjgELq(~S($
z5~zb=nqi~%_;02DC0>B~zf-tP?cRRhk1pN0Y|i|x?)6taa?5+mPgcQwx>E}@m2vbR
zLpdX9g`7V9b5a=e?-1kapDcXns@Qu@Ljr-;F^>MRAe1!HRn2HpAyBja&j}U!5A!ho
z?-=9iABWS`B2<<QmFK|JVXXcKZT)NJWWd28u2bKYP8pMW?oX+I)|~fQ@&4x+jtBia
z!sVNKv3KdXs%2z@=84$^NfIk@eh@?eZ^4pFeMI@|pCoD%AUZGGKa=`|2!#HZpxUWt
zW;`?N;~D80{xg1x0{{R300000pjGVM6!5prol9QtwVF2>WXR?gyksD<YIB6%ylQoC
zgf>H#=(sJEk%y3`r_?OHzHf76U1Wgn>)%ZGZRzb@8ltOVq|9GUjhr<W(HwLwoUQKd
zUv2J3B36yp{+E7V+oCoSrLr+JY&Y0O4+Dj7w!Uvo<V$qPr*waxeM&FE$z4jPV#DoG
zpTDhZ0lC*}U7_c5NV6`{2MYm;ed;f}>6q)xTNkgTmzam91-v(n40`=-7hoe~`8_L4
z_$o{&r!<f&7|59hZukqgziq(+@}_g`F07FYiwB-EZx@4?q*J<qZ_tR(ENuG?`FbRt
zX>J1XlnL5XCS-5p!1@iV2DZ|dMYfLQ(Nel1(`(&hZT~=I{klH$9%CFv4@A~R1|ofz
zL@u`<$<ZgW!P}NSXs5WXb=P**<7<m7APM`)vpP5i<-DO<=Yylq99ul}?p;Xw+uGX5
z>lTi7k63?PWzSU6Trscf%IroTzEZ~&zgv5|%h$G|t=iYJivZTznZtNy`<|F-9#GE_
z)E}u6)N$&4>TT)`>SY`N0000000000006*e7jM8#l5UsV?+ymMKJ(b^@w&-?+vCB1
zU2eBK7{nh30v^&$23_W_dIHo@g8GR1PwIEnd(<DPH>so4%cuKm00000000000000m
z9)1sLHv9u#PcT6G?K8K}=OF{6+iUGy9{gDlpLqiK#Ooo0L6<%L{{TUKLLJAg|2L=?
zs3)m^#{mEU00000000000Gvib0bh``n)+^c(C-W2!=T&c3i@zxz-;ro{9YUy2nO5%
zj};blx%_T3C=hV@T)tp15DfZAAHE_Gu*d%&Gw1(*K)psCr9Pzohk70d0000000000
z0002+*&*N!+T;EKZ@@b6`@LqP-|h3`0CVi$K6iQSbC=ua4F=8ee{Ucd2-@xcCkg6L
z)Cp_;|C`h+)N?oh0000000000004l`9wXEL3wysk@xNmJzrAx?KM`<wJjM3^Q)c`B
z_tbBw6Vz|1=c!k5000000000000000pG68k`48Ymevot(RseAMtj50A4)j<*{dZf-
z0O0n2V5I#I>;C}&0000000000z|>&k>;I9|+XVgx0000000000006)g<{Z*Pl7icP
z$I}<Qe*1ktx^(BVIrF!=*I)I>E$=Pw?CMM=x}tJ4iMr5`iiV9rRne2Op48-Q0u2sv
zo%*hH%0QE_&tN1$F!yX&WJ>oK(fmWO0*gZ#k!K`=;TVAx37TnE2Aqn#Zpa!<6LLD2
zl(p(#v)1QRl*>2uV&~FvRm;c*%@eZ;k|b8*{2&N#VH?4XH&4kqS;y_YMCWDuXHuUK
zo*8cvvmT}%oi&IL000000000009wFxf9vtL&7Dhrb(<k8X_P^RY;N951|q9AN9fI~
zR`*6|Gh~U5o4^@)2x)ps&C=`pHb>S)2I#*2&2-<E-rl7lx*A3*r=g*dv&JH3)fg(A
zt?un#ZSL!cm{o8aJtJ#7k=B?+98FK{nSezcaVOeMSBY3PUi%ieuWiv{a#t!FL&J8D
zf^}Hh$KWaWVtwD5$d~Ani|GD7y7a;&IGdN!sn~G)bw1MHws<jleZS?^(u+!uU1eUb
zyp@&01~P91oPdc&EhM3loF8H1!1@iV2DZ|dMYfLSLwnfeYrAYQNm%)1chn5Zc|*0%
z%X@Oz(lfIyA29V{|NoEe1oaN}6m>s!fchJ1kc!{{0000000000005YdoD;AnQMktY
z>}<a$7!byd<+G<=jQ81muZIi><A3BaKN{bb+Y=0qYwK*6Coq00=93;WIR5Qm{RaR5
z0000000000m`Y4+{Re3O0{{R300000006*LVlwMLK11Mt00000000000000?1z6Hg
zl0qO*Wv$2YqLi-xsAzIFn&di$QhG{BH*cxKQ?LJsPk5<({YO?5xG=2$=matwjdDsX
zDl@XIFoG;8yv*`!kJ!WTLXU_PxjT-+yh8J#x-V7f7>djJv=KVp-wvq7DfHiwOvPd-
zJ9!(8>tB*22K4U)p#SO3c>33oHiTlG&658_^e?gk=-&xI|I?f5`fr|%3Do~Ni39yR
z0qB2vGoJoanM~eVSFuSpCZPYY$b$Zz0Q5h-8BhO;tT**JZ(RREn1TAgBY^&=H)H?*
zrw&!ot@93sX^xGW`p06geE$oee<!JXya@PwVe3Dc(4wugQA7VM)_aBiIi831zauQ7
z(XgD!r4g;6oT{e`RohKR)oeVKpqJ<f(dQ)-bP}a=g~0R7|0<5os2ZYE*|^#s+Ln`z
zq<&?n@XDuG&|OJ2gSra&)pd=k?bLH}6m{j0rmI;w-Ib6vMNXiul&<HIBy!!j#u?q2
z&xJyZX!CBgoKE2?s7@Oytr|(B(b2S;#pi0AHj*jbN-8!_ZyD&tff^l8rIEf{H&Dik
zujBiSm-bHU!!h4r5x%^T`enJip3G}${4I1nxvZ;eM@&U43pv!8MFv)i6{dIKo9x69
zGrO&$;xhe>M6|xuXhb{T%!>+FTAznP+w!R^afGa#LA0EW(I_LQ()5zF8kN&#78SF^
z_TExj%^4{(quZC#s58-t<K}gwS@BoUm&(}Y8WqtOmL)N&W-fy2+VsON|F5wtfGx6N
z7V=?HtXlsY^Z$RcT$p1CT5M5z%9!Tmf0_UB7TuZB(J*RiZNTzB%QoPDk!5&z|2x50
z|EE*gogFw&63A$hjau!$vHq7B=>K(w@%`Ui8<zgV;RgI44)YR0b1ixv%K-n6`adIQ
zQ*opl@q9XMpkc!boxF|O{XgMK|8pGl|2V=6o{ir;iw->elb`p#HCsEf_m$L(hjDP1
z(rFAEge-O|yd=jYMG^(k?CKS{a1YP)^hhip<ycmV^P<En&~lin*!s6eep+Rt=KG%=
zQ~#ID@}Fr{f=Zd@`iGI4>5Z*_L(QdH*8$vk{ht$A(7z+t{*R?JoDD{czFkfKhbzbb
zd5(wrpCi;e?w2Kqy?5dL?2{P0_u3UiV2WewKc7ozatztSFr7V#)UeEl<IU<1wj|Cl
zHOqfCEQpfb|0m&(q5szbPU(LQ?aHS#q)*aD&GtXXSNK0H2=MzaM~JBg%0|@~3dOTl
zcTXsRvPhHDF_bpskcNiRYN8{mW^(vB&{Jx*V~=jc3gJU+NYBZdj_lKrZm1f9ro>d@
zl=`>)Ir+DMRsPp8{#T-Z#;pHC`QLi#@PAA;<VigQ_4?nc|A%?2|DP4X|4vY^|7Ekf
z-<g__jau)2v$1FSpW!)6{~XJ5F#pRL+yq%TGM56KF%o>TK=-kS$p850wPOq3y8qyN
z_smr9zb0{OY4kiB;zE2#2#FynRJ;vAP<We#H?pu<H+<upXDxl{(3W}J%vXd@pKN>Q
z|8_jUmEA0^s+G3`T1vh8S6Zllfe(+-KM(bPXBgH0CcJGd9J@yJ@BLTb2k*T9Zx8%)
z@}{ppvvBsUOE#VVlhWvUG89#FyO+_{3&J|aje%t}-;JV*$goi{F2ocRW#fod*q&|_
zmL(=8#TERHm%=?LM21F75JlEP#iS5KAKnIv$t3JD<nEiLAM8Yt`CA8eGvAgkTsZH;
zD=Isz%yMTX-ne3FVzC~P6}V_D+!GZ=R^mAZ2k}BY7VTlfNQ_4%S>brjw7wc|Yw+eV
z3!^4x^k<Vg^j~NgPTWS#{$HM}&_Ba7u>O-Hlosxx(wg4q|Ir7Cy7Z5qdmU-2YD}~r
zu%`Y;+J7bQe+KIRj!=jH8D?Z&Sc~d^;WGb+?e-tk|D0j8{>R|0*v~Uk|7$<$p1<<I
z57*uoJ98`f@tw2zeS=>tjh?q&7ew9Fy`rU2du2_;1<!u6=J}^qoY*6-`Oy1?y(`w-
zll|T}_8O_s5yaJH_FRR{vbTKEj$^U!T>Sf0CvW}xb1#}PW9_rjmNI+xI42`2-He13
zBqCW+6i!y8?(Xg$J{mL2svg93i!2W8vF+_Ev*#{srcJWSe7j0}j!nF)OaIZNoK2v#
znwVGtP(%M%>|^?WI05E=I|EtN{{~{T{a4igiM#Jky1EWtLtHUW`10|&um0h?yN;Dc
zU-D4br}W0fF2d!Veb&Cc^EVD$z4+dajR)7B+0NZQ_wNp#x#IBfuG`$(R{cPJi+FI&
zio@Mo=Rb60!%g=d{0I5QU+MRFZ+ZK+8*baNMt%9O*wJJV#N~wD_A4tK!Yej@Wo_c6
zH_p9d-J56Kc?SLd@gu(~O-!DRM^U_65Mq2(;bU<zEc2ou#Cb+!Q8&xS1VNT0BzK1l
z{5A_QG!wF$t)b{(MvbN7y9c9m2W=nPEn4fow_AVB2~ugIqWqAzr=R_F?jQc^+%;Ff
z<$1L87l*b#_NOuVlYY@fJRX_TwlCuQmovBV%1v|H`{K91*njaA$1eEA*M9ce()~|=
z|H_T49$(To@38+o&HF8C-}g7awsbFg_^%)T@eKDHbH01{@ZkB}@k<`+-m~JuSw|Nw
zTlt8a=>OW!(dKv0w_HYA?c-t-zO2ZpHN_(z6Bt2ctOAu`x)n+3i3&m#;esv9MVW3k
z)+6=AL`4|QHKmwqNK2r>=rTI0W-=**w$AH*_JGrlUH&&|psxyzkN<I!0Kflpgi`-c
zdZ4cgjr*S$MGpM$1Y`U^nSs74)I0t+>VF|DLH*AeVEk`dGe-ZD8R)A*z2kpn`iK61
zXMpiP82@WlM{gDC?f)&+zX-qoaR%uBh5p}Wb@W!DX8$iMRsQ}1pMd@yVUqp7!-xL-
z$){#syE5QCvV6<0ul2p$zgm2E{mZWHvks`g+5W_Z^oF0_+#kE`$gv$ii@bEhKOKIT
zJ1~FC@4hH6Jal5-rtRUOs~+<|@%8V$eep#X@1M7{m<+oH&~X3n@S(*|E)QMH%piWf
z;@)3+zj5HMhkm^OAk}@-_47aa__{20)yxxTKfU&m{gG`qoTG0{y|`|1^uT{#>APz%
z=sn|?hput`cF&=QpLp>6zE9>nx`%vb@a%6`ceUF)gTw@Se{q72vSxx-K4&Fp$E$bV
zyI<IF<c`pHR;gVtyz}9W>(D(<pWpMt-h1Yq_u!VcZ--L*X!hoTzTkHC#%sypkDj>W
zqwCV&`oinRoIg80`1U_4%Rk6=4c+`@<-G;3Bp1GB+4!`WU#9b*=ND*%((-@O6TPZX
ztNu4K|Er|_#|lvYbA+-fUuBcQPIt-xCn(i_J{waz?KuF}imJ`BQB(h8>VJ~CBrxdT
z0dNbcSnM;P`DsU`{##+ArvAg?nDIXz*8g;bvN1oVtOkHj%X$RO)6xzCC@ww+jCuc0
zdbamiYU+RN`X6Bi-v5qJ>i-tF-lAUr7w3PnLYM>pJA&>1CN1<*h1&1`%JpCHa{%MN
zj!@V7FR;$b=a$Z{)-U8%Q~zA$`j2c_g7V)HN@rXZr(Xa700000000000000000000
M0000005GNae{ufyI{*Lx

literal 0
Hc$@<O00001

diff --git a/mercurial/revlogutils/rewrite.py b/mercurial/revlogutils/rewrite.py
--- a/mercurial/revlogutils/rewrite.py
+++ b/mercurial/revlogutils/rewrite.py
@@ -7,6 +7,7 @@
 # This software may be used and distributed according to the terms of the
 # GNU General Public License version 2 or any later version.
 
+import binascii
 import contextlib
 import os
 
@@ -472,3 +473,195 @@
     new_index_file.write(entry_bin)
     revlog._docket.index_end = new_index_file.tell()
     revlog._docket.data_end = new_data_file.tell()
+
+
+def _filelog_from_path(repo, path):
+    """Returns the filelog for the given `path`. Stolen from `engine.py`"""
+
+    from .. import filelog  # avoid cycle
+
+    # Drop the extension and the `data/` prefix
+    path_part = path.rsplit(b'.', 1)[0].split(b'/', 1)
+    if len(path_part) < 2:
+        msg = _(b"cannot recognize filelog from filename: '%s'")
+        msg %= path
+        raise error.Abort(msg)
+    path = path_part[1]
+    fl = filelog.filelog(repo.svfs, path)
+    rl = fl._revlog
+    assert rl._format_version == constants.REVLOGV1, rl._format_version
+    return fl
+
+
+def _reorder_filelog_parents(fl, to_fix):
+    """
+    Swaps p1 and p2 for all `to_fix` revisions of filelog `fl` and writes the 
new
+    version to disk, overwriting the old one with a rename.
+    """
+    from ..pure import parsers
+
+    assert len(to_fix) > 0
+    rl = fl._revlog
+    index_format = parsers.IndexObject.index_format
+
+    new_file_path = rl._indexfile + b'tmp-parents-fix'
+    util.copyfile(
+        rl.opener.join(rl._indexfile),
+        rl.opener.join(new_file_path),
+        checkambig=rl._checkambig,
+    )
+
+    with rl.opener(new_file_path, mode=b"r+") as fp:
+        if rl._inline:
+            index = parsers.InlinedIndexObject(fp.read())
+            for rev in fl.revs():
+                entry = rl.index[rev]
+                new_entry = list(entry)
+                if rev in to_fix:
+                    # Swap p1 and p2
+                    new_entry[5], new_entry[6] = entry[6], entry[5]
+                    packed = index_format.pack(*new_entry[:8])
+                    offset = index._calculate_index(rev)
+                    fp.seek(offset)
+                    fp.write(packed)
+        else:
+            for rev in to_fix:
+                entry = rl.index[rev]
+                new_entry = list(entry)
+                # Swap p1 and p2
+                new_entry[5], new_entry[6] = entry[6], entry[5]
+                packed = index_format.pack(*new_entry[:8])
+                offset = rev * index_format.size
+                fp.seek(offset)
+                fp.write(packed)
+
+    rl.opener.rename(new_file_path, rl._indexfile)
+    rl.clearcaches()
+    rl._loadindex()
+
+
+def _is_revision_corrupted(ui, fl, filerev, path):
+    try:
+        raw_text = fl.rawdata(filerev)
+    except error.CensoredNodeError:
+        # We don't care about censored nodes as they never carry metadata
+        return False
+    has_meta = raw_text.startswith(b'\x01\n')
+    if has_meta:
+        (p1, p2) = fl.parentrevs(filerev)
+        if p1 != nullrev and p2 == nullrev:
+            msg = b"found corrupted revision %d for filelog '%s'\n"
+            ui.warn(msg % (filerev, path))
+            return True
+    return False
+
+
+def _from_report(ui, repo, context, from_report, dry_run):
+    """
+    Fix the revisions given in the `from_report` file, but still checks if the
+    revisions are indeed corrupted to prevent an unfortunate cyclic situation.
+
+    See the doc for `debug_fix_issue6528` for the format documentation.
+    """
+    from .. import store  # prevent cycle
+
+    ui.write(_(b"Loading report file '%s'\n") % from_report)
+
+    with context(), open(from_report, mode='rb') as f:
+        for line in f.read().split(b'\n'):
+            if not line:
+                continue
+            encoded, filenodes = line.split(b' ', 1)
+            filename = store.decodefilename(encoded)
+            fl = _filelog_from_path(repo, filename)
+            to_fix = set(
+                fl.rev(binascii.unhexlify(n)) for n in filenodes.split()
+            )
+            excluded = set()
+
+            for filerev in to_fix:
+                if not _is_revision_corrupted(ui, fl, filerev, filename):
+                    msg = _(b"Revision %s of file '%s' is not corrupted\n")
+                    ui.warn(
+                        msg % (binascii.hexlify(fl.node(filerev)), filename)
+                    )
+                    excluded.add(filerev)
+
+            to_fix = to_fix - excluded
+            if len(to_fix) == 0:
+                msg = _(b"No corrupted revisions were found for '%s'\n")
+                ui.write(msg % filename)
+                continue
+            if not dry_run:
+                _reorder_filelog_parents(fl, to_fix)
+
+
+def fix_issue6528(ui, repo, dry_run=False, to_report=None, from_report=None):
+    from .. import store  # avoid cycle
+
+    @contextlib.contextmanager
+    def context():
+        if dry_run or to_report:  # No need for locking
+            yield
+        else:
+            with repo.wlock(), repo.lock(), ui.uninterruptible():
+                yield
+
+    if from_report:
+        return _from_report(ui, repo, context, from_report, dry_run)
+
+    report_entries = []
+
+    with context():
+        total = sum(
+            1
+            for (t, p, _e, _s) in repo.store.datafiles()
+            if p.endswith(b'.i') and t & store.FILEFLAGS_FILELOG
+        )
+
+        progress = ui.makeprogress(
+            _(b"Looking for corrupted revisions"),
+            unit=_(b"filelogs"),
+            total=total,
+        )
+        found_nothing = True
+
+        for file_type, path, encoded, _size in repo.store.datafiles():
+            if (
+                not path.endswith(b'.i')
+                or not file_type & store.FILEFLAGS_FILELOG
+            ):
+                continue
+            progress.increment()
+            fl = _filelog_from_path(repo, path)
+
+            # Set of filerevs (or hex filenodes if `to_report`) that need 
fixing
+            to_fix = set()
+            for filerev in fl.revs():
+                # TODO speed up by looking at the start of the delta
+                # If it hasn't changed, it's not worth looking at the other 
revs
+                # in the same chain
+                corrupted = _is_revision_corrupted(ui, fl, filerev, path)
+                if corrupted:
+                    found_nothing = False
+                    if not dry_run:
+                        if to_report:
+                            to_fix.add(binascii.hexlify(fl.node(filerev)))
+                        else:
+                            to_fix.add(filerev)
+
+            if len(to_fix) > 0:
+                if to_report:
+                    report_entries.append((encoded, to_fix))
+                else:
+                    _reorder_filelog_parents(fl, to_fix)
+
+        if found_nothing:
+            ui.write(_(b"No corrupted revisions were found\n"))
+
+        if to_report and len(report_entries) > 0:
+            with open(to_report, mode="wb") as f:
+                for path, to_fix in report_entries:
+                    f.write(b"%s %s\n" % (path, b" ".join(to_fix)))
+
+        progress.complete()
diff --git a/mercurial/debugcommands.py b/mercurial/debugcommands.py
--- a/mercurial/debugcommands.py
+++ b/mercurial/debugcommands.py
@@ -105,6 +105,7 @@
 from .revlogutils import (
     deltas as deltautil,
     nodemap,
+    rewrite,
     sidedata,
 )
 
@@ -1451,6 +1452,52 @@
         ui.write(b"%s\n" % f)
 
 
+@command(
+    b"debug-fix-issue6528",
+    [
+        (
+            b'',
+            b'to-report',
+            b'',
+            _(b'build a report of corrupted revisions to this file'),
+            _(b'FILE'),
+        ),
+        (
+            b'',
+            b'from-report',
+            b'',
+            _(b'fix revisions listed in this report file'),
+            _(b'FILE'),
+        ),
+    ]
+    + cmdutil.dryrunopts,
+)
+def debug_fix_issue6528(ui, repo, **opts):
+    """find corrupted revisions and fix them. See issue6528 for more details.
+
+    The `--to-report` and `--from-report` flags allow you to cache and reuse 
the
+    computation of corrupted revisions for a given repository across clones.
+    The report format is line-based (with empty lines ignored):
+        <encoded filelog index file path> <ascii-hex of the corrupted 
revision>...
+
+    Though unlikely, there can be multiple broken revisions per filelog.
+
+    Note that this does *not* mean that this fixes future corrupted revisions, 
that
+    needs a separate fix at the exchange level that hasn't been written yet 
(as of
+    5.9rc0).
+    """
+    cmdutil.check_incompatible_arguments(
+        opts, 'to_report', ['from_report', 'dry_run']
+    )
+    dry_run = bool(opts.get('dry_run'))
+    to_report = opts.get('to_report')
+    from_report = opts.get('from_report')
+
+    rewrite.fix_issue6528(
+        ui, repo, dry_run=dry_run, to_report=to_report, from_report=from_report
+    )
+
+
 @command(b'debugformat', [] + cmdutil.formatteropts)
 def debugformat(ui, repo, **opts):
     """display format information about the current repository



To: Alphare, #hg-reviewers
Cc: mercurial-patches, mercurial-devel
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

D11239: debugcommands: introduce a debug command to fix repos affected by issue6528

Reply via email to