Simen Endsjø <[email protected]> writes:

> Rutherther <[email protected]> writes:
>
>> Hi Simen,
>>
>> Simen Endsjø <[email protected]> writes:
>>
>>> I broke my system real bad!
>>>
>>> I suspect a disk corruption during kexec boot after a reconfigure. I got a 
>>> blank system derivation file.
>>>
>>> Booted into an older generation and ran gc, but it seems like only the 
>>> newest generation was being kept (marked as current for some reason)! So 
>>> now I don't have any correct system generation!
>>
>>>
>>> I can boot into the system, but I cannot reconfigure as the file is blank. 
>>> I tried to remount the store as rw and delete the file, but then I get "no 
>>> such file or directory" as it tries to read the file.
>>
>> Yes, unfortunately Guix expects that files aren't corrupted, so it doesn't
>> really recover from such states. The only exception is guix gc
>> --verify, which expects that something can be broken and can fix it if it
>> is substitutable. It should usually be run with the contents and repair
>> arguments to repair file corruption.
>>
>> As for your experiment with removing the file manually, you would also
>> need to remove the record in the database that says the file is there.
>> The database is at `/var/guix/db/db.sqlite`, it keeps the list of files
>> in the store, their hashes, references etc. But I would suggest against
>> taking this route. You need to remove also all referrers of the path you
>> want to remove, and at that point it's safer to use guix commands rather
>> than touching the database yourself, specifically guix gc -D. That
>> may still work with your kind of corruption, where drvs are corrupted.
>> Just one thing: you will need to run the guix daemon without the
>> --keep-derivations flag that is added by default, so stop the system
>> instance managed by shepherd and start your own one. Then you can check
>> the referrers of the file you want to remove with guix gc --referrers,
>> and after you recursively look at all referrers and remove the leaves,
>> you can work your way back to the file you originally wanted to remove.
>> It can also still help to run guix gc --verify=contents,repair to get a
>> list of all corrupted files, and possibly repair them. If they cannot
>> be repaired, you can remove them.
>
> Sounds complicated and dangerous to do without having a good
> understanding of the internals :/
>
> I tried running gc with verify,repair, but I found out it didn't
> actually solve this issue and stopped it.

Removing records from the db is definitely dangerous, but so is touching
the store files! That's why I suggested gc, which shouldn't be dangerous
for the currently running system as long as you use guix 'properly'. It
can be considered dangerous if you, for example, hardcode paths to the
store without gc rooting them. Apart from that, it will not give you a way
to remove something that shouldn't be removed. But yes, it is kinda
complicated as you have to do everything manually. I have been thinking
for some time about making a tool that makes repairing corruptions of the
guix store easier.
If someone was open to brainstorming I would definitely appreciate that.
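
To make the "remove the leaves first" idea from my earlier mail a bit more
concrete, here is a minimal sketch of how the removal order could be computed.
It is only an illustration, not an existing tool: the helper names
(`guix_referrers`, `removal_order`, `deletion_commands`) are made up, and it
assumes that `guix gc --referrers` prints one store item per line. It only
builds the `guix gc -D` command lines; it does not delete anything itself.

```python
import subprocess


def guix_referrers(path):
    """Ask Guix which store items refer to `path` (runs `guix gc --referrers`).

    Assumption: the command prints one store item per line.
    """
    out = subprocess.run(
        ["guix", "gc", "--referrers", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def removal_order(path, referrers=guix_referrers):
    """Return `path` plus everything that refers to it, leaves first.

    This is a depth-first post-order walk over the referrer graph, so an
    item is only listed after all of its referrers have been listed.
    """
    order, seen = [], set()

    def visit(item):
        if item in seen:          # also covers items that refer to themselves
            return
        seen.add(item)
        for ref in referrers(item):
            visit(ref)
        order.append(item)

    visit(path)
    return order


def deletion_commands(order):
    """Turn a removal order into `guix gc -D` command lines (not executed)."""
    return [f"guix gc -D {item}" for item in order]
```

You would call `removal_order` with the corrupted store item and review the
output of `deletion_commands` before running anything by hand. The
`referrers` parameter is only there so the walk can be tried on a made-up
graph without a daemon running.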

>
>>>
>>> How on earth can I get out of this mess I have created? Can I try to build 
>>> this system on another computer and import it into the bad computer...? Or 
>>> any other way I can reconfigure the system even though I have a bad system? 
>>> Avoiding the checks?
>>
>> Note that what the other person recommended in response to you likely
>> won't work because the drv is still going to be necessary even when 
>> chrooting.
>>
>> As for the easiest solution, in terms of number of steps or thinking, it
>> should be to just reinitialize your system: remove the whole /gnu/store
>> and /var/guix, losing all information about generations of profiles, both
>> system and user. Then you would `guix system init` your config.scm in a
>> live iso, basically following a few steps from the installation guide in
>> the manual, skipping partitioning and such. You shouldn't lose any data,
>> but it is still probably better to make a backup if you don't have one.
>>
>> Rutherther
>
> Thanks! I used this path, and it worked great! I built a new
> installation medium with the required drivers and a recent pull. I followed
> the manual installation guide (without touching the partitions), ran
> init, and rebooted.
>
> I made two mistakes though: first, I forgot to mount EFI, and second,
> I had opened the LUKS partition with a different name than what I had in
> my configuration, which apparently had the effect of not being able to
> boot. After solving these issues I could boot and also build my home.

Glad that you got it sorted out!
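
For anyone finding this thread later, the reinitialization route, including
the two pitfalls mentioned above, could be sketched roughly like this. It is
a dry-run helper that only returns the commands in order. Everything specific
in it is an assumption for illustration: the function name `reinit_plan`, the
device names in the usage note, an encrypted root mounted at /mnt, the EFI
partition mounted at /mnt/boot/efi, and the config living at
/mnt/etc/config.scm. Adjust it to what your config.scm actually declares.

```python
def reinit_plan(root_part, efi_part, luks_name, config="/mnt/etc/config.scm"):
    """Return the commands, in order, for reinitializing from a live ISO.

    Nothing is executed; the list is meant to be read and run by hand.
    `luks_name` must match the mapped device name used in config.scm.
    """
    root_dev = f"/dev/mapper/{luks_name}"
    return [
        # Open the LUKS partition under the same name the configuration uses.
        f"cryptsetup open {root_part} {luks_name}",
        # Mount the existing root file system; no partitioning or formatting.
        f"mount {root_dev} /mnt",
        # Mount the EFI partition (mount point is an assumption).
        f"mount {efi_part} /mnt/boot/efi",
        # Drop the corrupted store and its database; all generations are lost.
        "rm -rf /mnt/gnu/store /mnt/var/guix",
        # From the installation guide: make the target store writable.
        "herd start cow-store /mnt",
        # Reinstall the system from the existing configuration.
        f"guix system init {config} /mnt",
    ]
```

For example, `reinit_plan("/dev/sda2", "/dev/sda1", "cryptroot")` lists the
steps for a hypothetical layout with the encrypted root on /dev/sda2 and the
EFI partition on /dev/sda1.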

>
> Feels strange to go from system generation 495 to 1 ;)

Rutherther