On a machine running FreeBSD 13.5, originally set up with GELI encryption and a ZFS file system by the FreeBSD 11 installer, the following error message appears during the automatic FreeBSD boot process, before it stops at the mountroot prompt.
Mounting from zfs:zroot/ROOT/default failed with error 6.
Loader variables:
vfs.root.mountfrom=zfs:zroot/ROOT/default
Manual root filesystem specification:
mountroot>
Trying to mount manually gives a label error?
mountroot> zfs: zroot/ROOT/default ?
Sun Solaris label '' ?
The problem occurred after a second, previously unused SSD was added to the machine, with the plan to provide extra storage to the existing ZFS storage pool. The secondary SSD was prepared with GELI and ZFS. The machine was then rebooted.
If the machine is booted from a FreeBSD USB memory stick, GELI can attach each SSD, provided that ZFS first imports the boot partition that holds the GELI encryption keyfile.
POSIX error code 6 is ENXIO (device not configured). Missing device?
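For reference, a minimal sketch of an errno lookup for a few codes that can show up at the mountroot prompt. The helper name errno_name is hypothetical and not part of FreeBSD; the names and messages are those defined in FreeBSD's sys/errno.h.

```shell
# Hypothetical helper: map a few FreeBSD errno values to their names.
# Only a handful of codes are listed; anything else prints "unknown".
errno_name() {
    case "$1" in
        2)  echo "ENOENT: No such file or directory" ;;
        5)  echo "EIO: Input/output error" ;;
        6)  echo "ENXIO: Device not configured" ;;
        19) echo "ENODEV: Operation not supported by device" ;;
        22) echo "EINVAL: Invalid argument" ;;
        *)  echo "unknown" ;;
    esac
}

# The code reported by "Mounting from zfs:zroot/ROOT/default failed with error 6".
errno_name 6
```

ENXIO fits the suspicion of a missing device: at mount time the kernel could not find every provider that the pool needs.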
# cat loader.conf
geli_ada0p4_keyfile0_load="YES"
geli_ada0p4_keyfile0_type="ada0p4:geli_keyfile0"
geli_ada0p4_keyfile0_name="/boot/encryption.key"
aesni_load="YES"
geom_eli_load="YES"
geom_eli_passphrase_prompt="YES"
vfs.root.mountfrom="zfs:zroot/ROOT/default"
kern.geom.label.gptid.enable="0"
zpool_cache_load="YES"
zpool_cache_type="/boot/zfs/zpool.cache"
zpool_cache_name="/boot/zfs/zpool.cache"
zfs_load="YES"
pf_load="YES"
pflog_load="YES"
Setting boot flag.
# kldload geom_eli
# kldload zfs
# zpool import -R /mnt bootpool
# geli attach -k /mnt/boot/encryption.key ada0p4
GEOM_ELI: Device ada0p4.eli created.
# geli attach ada1p1
# zpool export bootpool
GEOM_ELI: Device ada1p1.eli created.
# geli info ada0p4.eli
Flags: BOOT
# geli info ada1p1
Flags: AUTORESIZE
# geli configure -b ada1p1
# geli info ada1p1
Flags: BOOT, AUTORESIZE
# geli detach ..
# geli detach ..
# reboot
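The flag check in the steps above can be sketched as a small filter. This is only a sketch: boot_flag_state is a hypothetical helper, and it assumes the input contains a "Flags:" line in the form shown above, as printed by geli list.

```shell
# Hypothetical helper: read "Flags:" style text on stdin and report
# whether the BOOT flag is present. Matches BOOT as a whole word only,
# so a flag such as GELIBOOT does not count.
boot_flag_state() {
    if grep '^[[:space:]]*Flags:' | grep -qw 'BOOT'; then
        echo "set"
    else
        echo "missing"
    fi
}

# The two flag lines seen for ada1p1, before and after "geli configure -b".
printf 'Flags: AUTORESIZE\n' | boot_flag_state
printf 'Flags: BOOT, AUTORESIZE\n' | boot_flag_state
```

On a live system with the provider attached, something like "geli list ada1p1.eli | boot_flag_state" could be used before deciding whether to run geli configure -b.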
This solved the error 6 issue. However, further action was needed, because the boot now stopped while mounting filesystems.
ZFS cache.
Import the ZFS root storage pool. This ensures that ZFS can see all the devices and that the pool is cleanly imported.
# kbdcontrol -l /usr/share/syscons/keymaps/danish.iso.kbd
# kldload geom_eli
# kldload zfs
# mdmfs -s 100m md0 /mnt
# zpool import -f -R /mnt bootpool
# geli attach -k /mnt/boot/encryption.key ada0p4
GEOM_ELI: Device ada0p4.eli created.
# geli attach ada1p1
GEOM_ELI: Device ada1p1.eli created.
# zpool import
pool: zroot
state: ONLINE
config:
zroot ONLINE
  ada0p4.eli ONLINE
  ada1p1.eli ONLINE
# zpool import -f -R /mnt zroot
# zpool status
pool: zroot
state: ONLINE
errors: No known data errors
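To confirm that ZFS sees every GELI-backed device, the pool listing can be filtered for .eli names. This is a minimal sketch; eli_vdevs is a hypothetical helper, and the sample input is the listing shown above, not fresh output.

```shell
# Hypothetical helper: print the GELI-backed vdev names found in
# "zpool import" or "zpool status" style text read from stdin.
eli_vdevs() {
    awk '$1 ~ /\.eli$/ { print $1 }'
}

# Sample input taken from the pool listing above.
printf '%s\n' 'pool: zroot' 'state: ONLINE' 'config:' \
    'zroot ONLINE' '  ada0p4.eli ONLINE' '  ada1p1.eli ONLINE' | eli_vdevs
```

On a live system the same filter could be applied to "zpool status zroot" and the result compared with the providers that geli status reports as attached.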
Replace the symbolic link with a directory that can be used for mounting the boot partition. Export and import the ZFS storage pools again, this time so that the root is complete with the boot partition, where the ZFS cache is stored. Create a backup of the old ZFS cache files and remove them. Generate a new ZFS cache file.
# file /mnt/boot
/mnt/boot: broken symbolic link to /bootpool/pool
# rm /mnt/boot
# mkdir -p /mnt/boot
# zpool export bootpool
# zpool export zroot
# zpool import -f -R /mnt zroot
# zpool import -f -R /mnt/boot bootpool
# find /mnt -type f -name 'zpool.cache'
# cp /mnt/boot/boot/zfs/zpool.cache /mnt/boot/boot/zfs/zpool.cache.bak
# rm /mnt/boot/boot/zfs/zpool.cache
# cp /mnt/etc/zfs/zpool.cache /mnt/etc/zfs/zpool.cache.bak
# rm /mnt/etc/zfs/zpool.cache
# zpool set cachefile=/mnt/boot/boot/zfs/zpool.cache zroot
# zpool set comment="Test" zroot
# find /mnt -type f -name 'zpool.cache'
# zpool export bootpool
# rmdir /mnt/boot
# ln -s /bootpool/boot /mnt/boot ?
# zpool export zroot
# geli detach ada1p1
# geli detach ada0p4
# reboot
This caused bootpool to not be imported at boot. When it was imported, it mounted as root and bricked the system.
Boot not mounted.
If /boot is a symbolic link, then delete it.
# ls -ld /boot
/boot -> /bootpool/boot
# zfs get mountpoint bootpool
# zpool set cachefile=/boot/zfs/zpool.cache zroot
# find / -type f -name 'zpool.cache' -delete
X?
Could this happen because GELI on the second SSD was configured with the same passphrase as the primary SSD, but without the keyfile component? Could it be caused by the ZFS cache?
Ideas?
- Suspect that the GELI boot flag is required on the secondary SSD for the boot loader to decrypt it. The boot flag might be required for the boot loader to decrypt partitions that are needed for mounting the root.
- Suspect that ZFS cache causes the error. Refresh it? Delete it?
- Suspect that incorrect or missing ZFS labels cause the error. Clear or set?
- Un-merge the secondary SSD from the ZFS storage pool and attempt to revert to previous working state of the machine.
- Suspect that the missing GELI keyfile requirement on the secondary SSD causes the error. Enable the GELI keyfile for the secondary SSD, so both SSDs can be attached with the same user passphrase and the same keyfile by the FreeBSD automatic boot process.
- Suspect that the missing GELI keyfile requirement on the secondary SSD causes the error. Disable the GELI keyfile for the primary SSD, so both SSDs can be attached with the same user passphrase by the boot loader.
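For the idea of giving the secondary SSD the same keyfile, the matching loader.conf entries would presumably follow the pattern already used for ada0p4. The sketch below only prints such lines. loader_keyfile_lines is a hypothetical helper, and it is an assumption that entries for ada1p1 behave like those for ada0p4. The provider would also first need a key that includes the keyfile component, which is not shown here.

```shell
# Hypothetical helper: print loader.conf keyfile entries for one provider,
# mirroring the existing geli_ada0p4_keyfile0_* lines.
loader_keyfile_lines() {
    provider=$1
    keyfile=$2
    printf 'geli_%s_keyfile0_load="YES"\n' "$provider"
    printf 'geli_%s_keyfile0_type="%s:geli_keyfile0"\n' "$provider" "$provider"
    printf 'geli_%s_keyfile0_name="%s"\n' "$provider" "$keyfile"
}

# Assumed values: the secondary provider and the existing keyfile path.
loader_keyfile_lines ada1p1 /boot/encryption.key
```

The output is meant for review before anything is appended to loader.conf; the helper itself does not modify any file.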
Any ideas?
Conclusion.
Do not add extra storage devices to a FreeBSD machine that uses GELI and ZFS by simply merging them into the existing pool. You will end up with a bricked system that is full of catch-22 type critical errors. This is caused by how the ZFS cache works and by how the boot and root partitions are interconnected through a symbolic link that blocks mounting. The way to add extra storage devices is a fresh reinstall of the machine. You get the current FreeBSD, which includes file system related changes, such as GELI and ZFS changes. You also save yourself a lot of time.
References.
- https://man.freebsd.org/cgi/man.cgi?zpool(8)
- https://forums.freebsd.org/threads/mounting-zfs-failed-with-error-6-during-boot.97894/
- https://www.adyxax.org/blog/2023/01/05/recover-a-freebsd-system-using-a-liveusb/
- https://forums.freebsd.org/threads/zfs-zrrot-root-default-failed-with-error-6-after-fresh-install.91349/