Re: smartctl cannot access my storage, need syntax help
On 1/22/24 04:46, David Christensen wrote:
On 1/21/24 21:42, gene heskett wrote:
On 1/21/24 18:29, David Christensen wrote:
On 1/21/24 14:48, gene heskett wrote:
On 1/21/24 16:13, David Christensen wrote:
On 1/21/24 03:47, gene heskett wrote:
On 1/21/24 01:33, David Christensen wrote:

3. For Amanda, either add more HDD's to the storage server or build another server. If another server, shut it down when you are not using it.

...

Designed to run every night when things are relatively quiet, how well this works depends on the other machines it is backing up being available.

...

1. Wake on LAN.
2. Wake at preset day/time.

...

Unfortunately, amanda is truly ancient. For some reason the originator, who first wrote it in the later 70's IIRC, sold it nearly 20 years ago to a commercial outfit called zmanda, who took it more or less commercial, throwing the amanda-named version under the bus. They must have run out of money and resold it to another outfit, who has redoubled their effort to get rid of the free version. One of the things it has not been fixed to do is issue a wakeup call and wait for the clients to get their stuff in one sock, say 30 seconds to get everything spun up and ready to take orders, so I'm pretty sure a client that doesn't respond in milliseconds will be skipped.

So basically, amanda needs to be officially forked, since the current owner, Betsol, has made only one contribution to amanda that amounts to an actual update in 6 or 7 years now. There are several things it now needs, such as wake-on-lan support done right. Python is part of it, but python 2 is still needed. Or a whole new start for something to replace it and put it back squarely in the gplv2 or 3 camp. If there is actually another program capable of diddling the level schedule like amanda does, I'm sure I could name some of the major users that would jump ship in a week or so once they became aware of a workalike.

It appears Amanda has a script API for both the client and the server:

https://manpages.debian.org/buster/amanda-common/amanda-scripts.7.en.html

The zmanda wiki has a Script API page, but it is empty (?):

https://wiki.zmanda.com/index.php/Script_API

For BIOS/UEFI wake-on-lan, it might be possible to write a script that wakes the clients, to write a script that shuts down the clients, to configure the Amanda server to run the wake script before backups, and to configure the Amanda server to run the shutdown script after backups.

For BIOS/UEFI wake at preset day/time, it might be possible to set the clients to wake before the scheduled backup time, to write a script that shuts down the clients, and to configure the Amanda server to run the shutdown script after backups.

David

All this is possible, David, but it needs someone to do it. So far our list of volunteers is pretty slim. I once wrote a script that added amanda's database to the end of the vtape amanda had just made, so that a bare metal recovery restored the state it had just reported instead of being one run out of date, but I did that in bash. I've never done anything to amanda itself except compile it. It's old perl, old python and probably older bash, all dumped into the same bowl and the mixer turned on high. Amanda, right now, needs 15 years of catch-up TLC. I haven't even tried to build it since wheezy, and I have far newer srcs than the current 3.5.1 here. While I have the /home/amanda dir with all that is in it, I'd have to create a new amanda user with passwd-less access to the system to even attempt a build of what I have.

Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order."
 -Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
On 1/21/24 21:42, gene heskett wrote:
On 1/21/24 18:29, David Christensen wrote:
On 1/21/24 14:48, gene heskett wrote:
On 1/21/24 16:13, David Christensen wrote:
On 1/21/24 03:47, gene heskett wrote:
On 1/21/24 01:33, David Christensen wrote:

3. For Amanda, either add more HDD's to the storage server or build another server. If another server, shut it down when you are not using it.

...

Designed to run every night when things are relatively quiet, how well this works depends on the other machines it is backing up being available.

...

1. Wake on LAN.
2. Wake at preset day/time.

...

Unfortunately, amanda is truly ancient. For some reason the originator, who first wrote it in the later 70's IIRC, sold it nearly 20 years ago to a commercial outfit called zmanda, who took it more or less commercial, throwing the amanda-named version under the bus. They must have run out of money and resold it to another outfit, who has redoubled their effort to get rid of the free version. One of the things it has not been fixed to do is issue a wakeup call and wait for the clients to get their stuff in one sock, say 30 seconds to get everything spun up and ready to take orders, so I'm pretty sure a client that doesn't respond in milliseconds will be skipped.

So basically, amanda needs to be officially forked, since the current owner, Betsol, has made only one contribution to amanda that amounts to an actual update in 6 or 7 years now. There are several things it now needs, such as wake-on-lan support done right. Python is part of it, but python 2 is still needed. Or a whole new start for something to replace it and put it back squarely in the gplv2 or 3 camp. If there is actually another program capable of diddling the level schedule like amanda does, I'm sure I could name some of the major users that would jump ship in a week or so once they became aware of a workalike.

It appears Amanda has a script API for both the client and the server:

https://manpages.debian.org/buster/amanda-common/amanda-scripts.7.en.html

The zmanda wiki has a Script API page, but it is empty (?):

https://wiki.zmanda.com/index.php/Script_API

For BIOS/UEFI wake-on-lan, it might be possible to write a script that wakes the clients, to write a script that shuts down the clients, to configure the Amanda server to run the wake script before backups, and to configure the Amanda server to run the shutdown script after backups.

For BIOS/UEFI wake at preset day/time, it might be possible to set the clients to wake before the scheduled backup time, to write a script that shuts down the clients, and to configure the Amanda server to run the shutdown script after backups.

David
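For reference, the "script that wakes the clients" part could look roughly like the Python sketch below. It only builds and broadcasts a standard Wake-on-LAN magic packet (6 bytes of 0xFF followed by the target MAC repeated 16 times). The function names, the broadcast address and UDP port 9 are assumptions for illustration, and nothing here shows how the script would be hooked into Amanda; that would have to be checked against amanda-scripts(7).

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN magic packet for a MAC address."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # 6 x 0xFF synchronisation bytes, then the MAC repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet on the local network (assumed port 9)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


# Example call with a made-up MAC address:
#   wake("00:11:22:33:44:55")
```

The clients still need Wake-on-LAN enabled in BIOS/UEFI and on the network interface, and the server would have to wait for them to finish booting before the backup run starts.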
Re: smartctl cannot access my storage, need syntax help
On 1/21/24 14:48, gene heskett wrote: On 1/21/24 16:13, David Christensen wrote: On 1/21/24 03:47, gene heskett wrote: On 1/21/24 01:33, David Christensen wrote: I am still uncertain if those are internal SSD errors or SATA errors. Please check if you see matching errors in dmesg(1). There aren't any. Those hours would very closely correspond to my attempts to rsync and the OOM deamon killed the machine, which it did around 10 times. So logging by then had been killed. That to me is the smoking gun. Kernel ring buffer is renewed with each boot and newer messages overwrite older messages. So, you will want to save or clear the ring buffer with demsg(1), save a SMART full report, exercise the disk with dd(1) and/or a SMART test, save the ring buffer, save a SMART full report, and analyze everything to see if you have disk problems, SATA problems, and/or system problems. Once everything passes without error, the disk is ready to be put into service. 2T is enough /home for the nonce. so I'll do the rsync thing going the other direction, using it for a backup of /home until I'm ready for trixie. However I am tempted to zero the drives an recreate the raid w/o formatting since the mdadm seems capable to installing itw own filesystems to use the whole drive unpartitioned, giving me a backup that sizewise is about the same as the single 2T drive has now. And although my single experience with lvm over a decade ago was a total disaster, made out of used spinning rust I may now see how the other 4 2T's assembled as a lvm for amandas vtapes as an 8T lvm to backup the whole system, which in addition to the 4 cnc'd machines, has over the last 5 years seen a train of 3d printers go by. If all 3, currently a WIP, get rebuilt, the smallest is 305 by, the largest is 400 by. And all I hope will lay plastic at 200+ mm a second. Normal consumer stuff is 40 to 60. Obviously I have an eclectic choice of too many hobbies. 
;o)> Now if curiosity doesn't kill this cat, I need to find some breakfast and git to it. This and other threads have led me to the conclusion that consumer SSD's are meant for devices that are off most of the time -- e.g. notepad, laptop, desktop, and workstation computers. If you put them into a NAS/ file server and run them 24x7, they will die sometime after 2 years. That has not been my experience at all David, I bought a 4 pack of 120G ssd's when they were the biggest available and replaced 3 spinning rust drives that had 50-70k hours on them with these. My cnc machines are all wired so power for the mill/lathe/what have you is totally controlled by the enable key, f2, so if f2 is off only the computer is running. That was at least 6 years ago. Then I installed a 240G as an extra drive on the rpi4 that runs my biggest lathe and made a buildbot out of it to pull linuxcnc-master from github and build it, also armhf kernels for linuxcnc's realtime needs. The 120G disappeared in about a year, replaced the adapter with a startech, drive was and is just fine. There is now at least 5 years on everyone of those original 120's, zero SSD problems in the whole lot. I also have small SSD's that have lasted far longer than 2 years on mixed duty, including 24x7 (Intel SSD 520 Series 60 GB). The relevant recent threads on this list seem to be 1+ TB Samsung's. It is interesting to note that BackBlaze does not seem to use Samsung SSD's: https://www.backblaze.com/blog/ssd-edition-2023-mid-year-drive-stats-review/ 3. For Amanda, either add more HDD's to the storage server or build another server. If another server, shut it down when you are not using it. Speaking as someone who has used amanda for about 25 years: People don't always understand that one of Amanda prime directives is to balance the size of an individual back up run by advancing the level 3 scheduled for tonight, by advancing it to level 0 if this run is only going to be small. 
The only guarantee is that if you have a 10 day schedule, all machines/dle's, will get that level 0 backup not more than 10 days after the last one. You choose how many days long that cycle is. I adjust it so the storage is around 75 to 80% used after the schedule has stabilized. This may take quite a few such cycles. Designed to run every night when things are relatively quiet, how this works well depends on the other machines it is backing up be available. Machines missing at backup time can and will muck things up for this efficient scheduling. Corporate users of Amanda, used to doing it their way, backing the weeks business on friday nights just don't understand that the Amanda way gets them a 100% coverage backup by backing up only the differences from the previous run of that dle every night is far superior to their fridey night when most of the offices machines are turned off for the weekend. For those cases we recommend composing two or more dle files and rigging cron to
Re: smartctl cannot access my storage, need syntax help
On 1/21/24 16:13, David Christensen wrote: On 1/21/24 03:47, gene heskett wrote: On 1/21/24 01:33, David Christensen wrote: I am still uncertain if those are internal SSD errors or SATA errors. Please check if you see matching errors in dmesg(1). There aren't any. Those hours would very closely correspond to my attempts to rsync and the OOM deamon killed the machine, which it did around 10 times. So logging by then had been killed. That to me is the smoking gun. Kernel ring buffer is renewed with each boot and newer messages overwrite older messages. So, you will want to save or clear the ring buffer with demsg(1), save a SMART full report, exercise the disk with dd(1) and/or a SMART test, save the ring buffer, save a SMART full report, and analyze everything to see if you have disk problems, SATA problems, and/or system problems. Once everything passes without error, the disk is ready to be put into service. 2T is enough /home for the nonce. so I'll do the rsync thing going the other direction, using it for a backup of /home until I'm ready for trixie. However I am tempted to zero the drives an recreate the raid w/o formatting since the mdadm seems capable to installing itw own filesystems to use the whole drive unpartitioned, giving me a backup that sizewise is about the same as the single 2T drive has now. And although my single experience with lvm over a decade ago was a total disaster, made out of used spinning rust I may now see how the other 4 2T's assembled as a lvm for amandas vtapes as an 8T lvm to backup the whole system, which in addition to the 4 cnc'd machines, has over the last 5 years seen a train of 3d printers go by. If all 3, currently a WIP, get rebuilt, the smallest is 305 by, the largest is 400 by. And all I hope will lay plastic at 200+ mm a second. Normal consumer stuff is 40 to 60. Obviously I have an eclectic choice of too many hobbies. ;o)> Now if curiosity doesn't kill this cat, I need to find some breakfast and git to it. 
This and other threads have led me to the conclusion that consumer SSD's are meant for devices that are off most of the time -- e.g. notepad, laptop, desktop, and workstation computers. If you put them into a NAS/ file server and run them 24x7, they will die sometime after 2 years. That has not been my experience at all David, I bought a 4 pack of 120G ssd's when they were the biggest available and replaced 3 spinning rust drives that had 50-70k hours on them with these. My cnc machines are all wired so power for the mill/lathe/what have you is totally controlled by the enable key, f2, so if f2 is off only the computer is running. That was at least 6 years ago. Then I installed a 240G as an extra drive on the rpi4 that runs my biggest lathe and made a buildbot out of it to pull linuxcnc-master from github and build it, also armhf kernels for linuxcnc's realtime needs. The 120G disappeared in about a year, replaced the adapter with a startech, drive was and is just fine. There is now at least 5 years on everyone of those original 120's, zero SSD problems in the whole lot. So, I suggest: 1. Build a storage server using NAS or enterprise HDD's. Use an enterprise SSD or DOM for the OS. Run it 24x7 or shut it down as you like. 2. Use your Asus PRIME Z370-A II as a workstation. Install the WD Black M.2 NVMe PCIe SSD. Connect the optical drive to the first motherboard SATA port. Install Debian onto the WD Black. Then, connect the five Samsung EVO 870's to the remaining motherboard SATA ports. Set them up as a 5-way mirror (RAID1). Use the Samsung RAID as a scratch disk for your 3-D work. As the Samsung's die off, replace them with the Gigastones. Shut it down when you are not using it. 3. For Amanda, either add more HDD's to the storage server or build another server. If another server, shut it down when you are not using it. 
Speaking as someone who has used amanda for about 25 years: People don't always understand that one of Amanda prime directives is to balance the size of an individual back up run by advancing the level 3 scheduled for tonight, by advancing it to level 0 if this run is only going to be small. The only guarantee is that if you have a 10 day schedule, all machines/dle's, will get that level 0 backup not more than 10 days after the last one. You choose how many days long that cycle is. I adjust it so the storage is around 75 to 80% used after the schedule has stabilized. This may take quite a few such cycles. Designed to run every night when things are relatively quiet, how this works well depends on the other machines it is backing up be available. Machines missing at backup time can and will muck things up for this efficient scheduling. Corporate users of Amanda, used to doing it their way, backing the weeks business on friday nights just don't understand that the Amanda way gets them a 100% coverage backup by backing up only the
Re: smartctl cannot access my storage, need syntax help
On 1/21/24 03:47, gene heskett wrote:
On 1/21/24 01:33, David Christensen wrote:

I am still uncertain if those are internal SSD errors or SATA errors. Please check if you see matching errors in dmesg(1).

There aren't any. Those hours would very closely correspond to my attempts to rsync, when the OOM daemon killed the machine, which it did around 10 times. So logging by then had been killed. That to me is the smoking gun.

The kernel ring buffer is renewed with each boot, and newer messages overwrite older messages. So, you will want to save or clear the ring buffer with dmesg(1), save a full SMART report, exercise the disk with dd(1) and/or a SMART test, save the ring buffer again, save another full SMART report, and analyze everything to see if you have disk problems, SATA problems, and/or system problems. Once everything passes without error, the disk is ready to be put into service.

2T is enough /home for the nonce, so I'll do the rsync thing going the other direction, using it for a backup of /home until I'm ready for trixie. However, I am tempted to zero the drives and recreate the raid w/o formatting, since mdadm seems capable of installing its own filesystems to use the whole drive unpartitioned, giving me a backup that sizewise is about the same as the single 2T drive has now. And although my single experience with lvm over a decade ago, made out of used spinning rust, was a total disaster, I may now see how the other 4 2T's work assembled as an 8T lvm for amanda's vtapes to back up the whole system, which in addition to the 4 cnc'd machines has over the last 5 years seen a train of 3d printers go by. If all 3, currently a WIP, get rebuilt, the smallest is 305 by, the largest is 400 by. And all, I hope, will lay plastic at 200+ mm a second. Normal consumer stuff is 40 to 60. Obviously I have an eclectic choice of too many hobbies. ;o)> Now if curiosity doesn't kill this cat, I need to find some breakfast and git to it.

This and other threads have led me to the conclusion that consumer SSD's are meant for devices that are off most of the time -- e.g. notepad, laptop, desktop, and workstation computers. If you put them into a NAS/file server and run them 24x7, they will die sometime after 2 years.

So, I suggest:

1. Build a storage server using NAS or enterprise HDD's. Use an enterprise SSD or DOM for the OS. Run it 24x7 or shut it down as you like.

2. Use your Asus PRIME Z370-A II as a workstation. Install the WD Black M.2 NVMe PCIe SSD. Connect the optical drive to the first motherboard SATA port. Install Debian onto the WD Black. Then, connect the five Samsung EVO 870's to the remaining motherboard SATA ports. Set them up as a 5-way mirror (RAID1). Use the Samsung RAID as a scratch disk for your 3-D work. As the Samsung's die off, replace them with the Gigastones. Shut it down when you are not using it.

3. For Amanda, either add more HDD's to the storage server or build another server. If another server, shut it down when you are not using it.

David
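The test sequence described earlier in this message (save the ring buffer, save a SMART report, exercise the disk, then save both again) can be sketched as a small Python wrapper. This is only an outline under assumptions: the device path, the output directory and the file names are placeholders, the dd step is a read-only pass over the whole device, and everything needs to run as root.

```python
import subprocess
from pathlib import Path


def build_test_plan(device, outdir):
    """Return (argv, output_file) steps; output_file is None for the dd pass."""
    out = Path(outdir)
    return [
        (["dmesg"], out / "dmesg-before.txt"),
        (["smartctl", "-x", device], out / "smart-before.txt"),
        # Read-only exercise: read the whole device and discard the data.
        (["dd", f"if={device}", "of=/dev/null", "bs=1M"], None),
        (["dmesg"], out / "dmesg-after.txt"),
        (["smartctl", "-x", device], out / "smart-after.txt"),
    ]


def run_plan(plan):
    """Run each step in order, saving stdout where a file is given."""
    for argv, outfile in plan:
        if outfile is None:
            subprocess.run(argv, check=False)
        else:
            with open(outfile, "w") as fh:
                # smartctl uses its exit status as a bit mask, so a nonzero
                # status is not treated as a failure of this script.
                subprocess.run(argv, stdout=fh, check=False)


# Example (placeholder device and directory, run as root):
#   run_plan(build_test_plan("/dev/sdX", "/root/ssd-test"))
```

Comparing the before and after files, for example with diff(1), then shows whether the error counters moved or new kernel messages appeared during the test.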
Re: smartctl cannot access my storage, need syntax help
On 1/21/24 04:35, Max Nikulin wrote:
On 21/01/2024 03:23, gene heskett wrote:

Right now nothing in the system is north of 32C, might get to 36C at the end of a 9 minute build of something in OpenSCAD.

I would say that 53°C and even 44°C is well above 36°C you expected:

On 21/01/2024 12:48, gene heskett wrote:

SCT Status Version: 3
SCT Version (vendor specific): 256 (0x0100)
Device State: DST executing in background (3)
Current Temperature: 28 Celsius
Power Cycle Min/Max Temperature: 26/44 Celsius
Lifetime Min/Max Temperature: 24/53 Celsius
Specified Max Operating Temperature: 70 Celsius
Under/Over Temperature Limit Count: 0/0

Device Statistics (GP Log 0x04)
0x05 = = = === == Temperature Statistics (rev 1) ==
0x05 0x008 1 28 --- Current Temperature
0x05 0x020 1 53 --- Highest Temperature
0x05 0x028 1 24 --- Lowest Temperature
0x05 0x058 1 70 --- Specified Maximum Operating Temperature

IIRC the fan in the front of an upper drive cage got unplugged for a while, half an hour maybe, about a year ago while I was doing my annual D on it. These SSD's, all of them, have a label claiming they need 5 volts at 1 amp, that is 5 watts, but I don't think that is a steady load, probably only when writing at 500+ mhz. 1 watt or less of heat is much closer to normal operation.

Thank you, take care, stay warm, dry and well, Max.

Having a heat wave here, it's up to 21F out at 12:25 pm here, 16" of white stuff on the front deck, got cold & had to replace the batteries in my smart t-stat about an hour ago.

Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order."
 -Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
On 1/21/24 01:33, David Christensen wrote: On 1/20/24 21:48, gene heskett wrote: New -x version for this SSD attached > SMART Attributes Data Structure revision number: 1 > Vendor Specific SMART Attributes with Thresholds: > ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE > 5 Reallocated_Sector_Ct PO--CK 094 094 010 - 64 > 183 Runtime_Bad_Block PO--C- 094 094 010 - 64 > 187 Uncorrectable_Error_Cnt -O--CK 099 099 000 - 392 > 195 ECC_Error_Rate -O-RC- 199 199 000 - 392 > 199 CRC_Error_Count -OSRCK 099 099 000 - 2 Those attributes are worrisome. Especially Reallocated_Sector_Ct and Runtime_Bad_Block -- I am confident those are inside the SSD. > 9 Power_On_Hours -O--CK 095 095 000 - 21194 That is equivalent to 10.2 years at 40 hours/week. Machine runs 24/7/365.25 > 241 Total_LBAs_Written -O--CK 099 099 000 - 38429262625 TBW specification for 1 TB drive is 600TB. You are at 19.7. relatively low IOW. > Error 466 [1] occurred at disk power-on lifetime: 21078 hours (878 days + 6 hours) > When the command that caused the error occurred, the device was active or idle. > > After command completion occurred, registers were: > ER -- ST COUNT LBA_48 LH LM LL DV DC > -- -- -- == -- == == == -- -- -- -- -- > 40 -- 51 00 40 00 00 1b a4 0d 18 40 00 Error: WP at LBA = 0x1ba40d18 = 463736088 > > Commands leading to the command that caused the error were: > CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name > -- == -- == -- == == == -- -- -- -- -- --- > 61 00 08 00 40 00 00 1b a4 0d 18 40 08 1d+03:35:20.430 WRITE FPDMA QUEUED > 60 0a 00 00 38 00 00 70 f1 a4 00 40 07 1d+03:35:20.430 READ FPDMA QUEUED > 60 07 80 00 30 00 00 70 f1 3c 80 40 06 1d+03:35:20.430 READ FPDMA QUEUED > 61 00 28 00 28 00 00 1b a4 0d 38 40 05 1d+03:35:20.430 WRITE FPDMA QUEUED > 47 00 00 00 01 00 00 00 00 00 00 40 02 1d+03:35:20.430 READ LOG DMA EXT > > Error 465 [0] occurred at disk power-on lifetime: 21078 hours (878 days + 6 hours) > ... 
> Error 464 [3] occurred at disk power-on lifetime: 21078 hours (878 days + 6 hours) > ... > Error 463 [2] occurred at disk power-on lifetime: 21078 hours (878 days + 6 hours) I am still uncertain if those are internal SSD errors or SATA errors. Please check if you see matching errors in dmesg(1). There aren't any. Those hours would very closely correspond to my attempts to rsync and the OOM deamon killed the machine, which it did around 10 times. So logging by then had been killed. That to me is the smoking gun. 2T is enough /home for the nonce. so I'll do the rsync thing going the other direction, using it for a backup of /home until I'm ready for trixie. However I am tempted to zero the drives an recreate the raid w/o formatting since the mdadm seems capable to installing itw own filesystems to use the whole drive unpartitioned, giving me a backup that sizewise is about the same as the single 2T drive has now. And although my single experience with lvm over a decade ago was a total disaster, made out of used spinning rust I may now see how the other 4 2T's assembled as a lvm for amandas vtapes as an 8T lvm to backup the whole system, which in addition to the 4 cnc'd machines, has over the last 5 years seen a train of 3d printers go by. If all 3, currently a WIP, get rebuilt, the smallest is 305 by, the largest is 400 by. And all I hope will lay plastic at 200+ mm a second. Normal consumer stuff is 40 to 60. Obviously I have an eclectic choice of too many hobbies. ;o)> Now if curiosity doesn't kill this cat, I need to find some breakfast and git to it. Thank you David, take care, stay warm dry and well. David . Cheers, Gene Heskett. -- "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author, 1940) If we desire respect for the law, we must first make the law respectable. - Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
On 21/01/2024 03:23, gene heskett wrote:

Right now nothing in the system is north of 32C, might get to 36C at the end of a 9 minute build of something in OpenSCAD.

I would say that 53°C and even 44°C is well above 36°C you expected:

On 21/01/2024 12:48, gene heskett wrote:

SCT Status Version: 3
SCT Version (vendor specific): 256 (0x0100)
Device State: DST executing in background (3)
Current Temperature: 28 Celsius
Power Cycle Min/Max Temperature: 26/44 Celsius
Lifetime Min/Max Temperature: 24/53 Celsius
Specified Max Operating Temperature: 70 Celsius
Under/Over Temperature Limit Count: 0/0

Device Statistics (GP Log 0x04)
0x05 = = = === == Temperature Statistics (rev 1) ==
0x05 0x008 1 28 --- Current Temperature
0x05 0x020 1 53 --- Highest Temperature
0x05 0x028 1 24 --- Lowest Temperature
0x05 0x058 1 70 --- Specified Maximum Operating Temperature
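As a small illustration of the comparison made above, the Min/Max lines of the SCT status can be pulled out of saved smartctl -x text and checked against the expected peak. This is a hedged sketch: the sample text is copied from the report quoted in this message, the 36 C figure is the expectation stated in the thread, and the function and variable names are made up for the example.

```python
import re

# Sample lines taken from the SCT status quoted in this message.
SCT_STATUS = """\
Current Temperature: 28 Celsius
Power Cycle Min/Max Temperature: 26/44 Celsius
Lifetime Min/Max Temperature: 24/53 Celsius
Specified Max Operating Temperature: 70 Celsius
"""

MIN_MAX = re.compile(
    r"^(.*?)[ \t]*Min/Max Temperature:[ \t]*(-?\d+)/(-?\d+) Celsius",
    re.MULTILINE,
)


def min_max_temps(text):
    """Map each scope ("Power Cycle", "Lifetime") to a (min, max) tuple."""
    return {
        m.group(1).strip(): (int(m.group(2)), int(m.group(3)))
        for m in MIN_MAX.finditer(text)
    }


EXPECTED_PEAK_C = 36  # peak temperature the owner expected, per the thread

for scope, (low, high) in min_max_temps(SCT_STATUS).items():
    print(f"{scope}: max {high} C, {high - EXPECTED_PEAK_C:+d} C vs expected")
```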
Re: smartctl cannot access my storage, need syntax help
On 1/20/24 21:48, gene heskett wrote:
> New -x version for this SSD attached

> SMART Attributes Data Structure revision number: 1
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
>   5 Reallocated_Sector_Ct   PO--CK   094   094   010    -    64
> 183 Runtime_Bad_Block       PO--C-   094   094   010    -    64
> 187 Uncorrectable_Error_Cnt -O--CK   099   099   000    -    392
> 195 ECC_Error_Rate          -O-RC-   199   199   000    -    392
> 199 CRC_Error_Count         -OSRCK   099   099   000    -    2

Those attributes are worrisome. Especially Reallocated_Sector_Ct and Runtime_Bad_Block -- I am confident those are inside the SSD.

>   9 Power_On_Hours          -O--CK   095   095   000    -    21194

That is equivalent to 10.2 years at 40 hours/week.

> 241 Total_LBAs_Written      -O--CK   099   099   000    -    38429262625

The TBW specification for a 1 TB drive is 600 TB. You are at 19.7 TB.

> Error 466 [1] occurred at disk power-on lifetime: 21078 hours (878 days + 6 hours)
> When the command that caused the error occurred, the device was active or idle.
>
> After command completion occurred, registers were:
> ER -- ST COUNT LBA_48 LH LM LL DV DC
> -- -- -- == -- == == == -- -- -- -- --
> 40 -- 51 00 40 00 00 1b a4 0d 18 40 00 Error: WP at LBA = 0x1ba40d18 = 463736088
>
> Commands leading to the command that caused the error were:
> CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
> -- == -- == -- == == == -- -- -- -- -- ---
> 61 00 08 00 40 00 00 1b a4 0d 18 40 08 1d+03:35:20.430 WRITE FPDMA QUEUED
> 60 0a 00 00 38 00 00 70 f1 a4 00 40 07 1d+03:35:20.430 READ FPDMA QUEUED
> 60 07 80 00 30 00 00 70 f1 3c 80 40 06 1d+03:35:20.430 READ FPDMA QUEUED
> 61 00 28 00 28 00 00 1b a4 0d 38 40 05 1d+03:35:20.430 WRITE FPDMA QUEUED
> 47 00 00 00 01 00 00 00 00 00 00 40 02 1d+03:35:20.430 READ LOG DMA EXT
>
> Error 465 [0] occurred at disk power-on lifetime: 21078 hours (878 days + 6 hours)
> ...
> Error 464 [3] occurred at disk power-on lifetime: 21078 hours (878 days + 6 hours)
> ...
> Error 463 [2] occurred at disk power-on lifetime: 21078 hours (878 days + 6 hours)

I am still uncertain if those are internal SSD errors or SATA errors. Please check if you see matching errors in dmesg(1).

David
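The two estimates in the message above (years of use from Power_On_Hours, terabytes written from Total_LBAs_Written) can be rechecked with a few lines of Python. This is only a sketch: it assumes that the raw value of Total_LBAs_Written counts 512-byte logical sectors (the sector size reported elsewhere in the thread for this drive) and that "TB" means 10^12 bytes. The function names are made up for illustration.

```python
# Assumptions (not stated in the original message): 52 weeks per year,
# 512-byte LBAs, decimal terabytes.
WEEKS_PER_YEAR = 52


def years_at_work_week(power_on_hours, hours_per_week=40):
    """Convert SMART Power_On_Hours into years of use at a given weekly duty."""
    return power_on_hours / hours_per_week / WEEKS_PER_YEAR


def terabytes_written(total_lbas_written, sector_bytes=512):
    """Convert SMART Total_LBAs_Written into decimal terabytes."""
    return total_lbas_written * sector_bytes / 1e12


if __name__ == "__main__":
    # Raw values copied from the quoted smartctl -x report.
    print(f"{years_at_work_week(21194):.1f} years at 40 hours/week")
    print(f"{terabytes_written(38429262625):.1f} TB written")
```

Both results agree with the figures quoted in the message when rounded to one decimal place.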
Re: smartctl cannot access my storage, need syntax help
On 1/21/24 00:30, Max Nikulin wrote: On 21/01/2024 03:23, gene heskett wrote: On 1/20/24 10:24, Max Nikulin wrote: On 19/01/2024 06:10, gene heskett wrote: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 190 Airflow_Temperature_Cel 0x0032 071 049 000 Old_age Always - 29 Initial 100 decreased to 49 means that sometimes the drive is hot enough. I've been under the impression that 100C was the absolute temp limit Do not confuse normalized values (100 means shiny new, 0 means really old or damaged) and RAW_VALUE. For some drives smartctl -x may report history of temperature measurements, but I think summer values are already unavailable. and it not been over 36C that I know of according to gkrellm which s set to monitor that stuff in real time. Right now nothing in the system is north of 32C, might get to 36C 71 <-> 29 °C and 49 <-> 36 °C mapping might be possible, but I would expect higher temperature for 49. I read up on the manpage. New -x version for this SSD attached Cheers, Gene Heskett. -- "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author, 1940) If we desire respect for the law, we must first make the law respectable. - Louis D. 
Brandeis smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-17-rt-amd64] (local build) Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Samsung based SSDs Device Model: Samsung SSD 870 EVO 1TB Serial Number:S626NF0R302498T LU WWN Device Id: 5 002538 f413394a5 Firmware Version: SVT01B6Q User Capacity:1,000,204,886,016 bytes [1.00 TB] Sector Size: 512 bytes logical/physical Rotation Rate:Solid State Device Form Factor: 2.5 inches TRIM Command: Available, deterministic, zeroed Device is:In smartctl database 7.3/5319 ATA Version is: ACS-4 T13/BSR INCITS 529 revision 5 SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s) Local Time is:Sun Jan 21 00:44:08 2024 EST SMART support is: Available - device has SMART capability. SMART support is: Enabled AAM feature is: Unavailable APM feature is: Unavailable Rd look-ahead is: Enabled Write cache is: Enabled DSN feature is: Unavailable ATA Security is: Disabled, NOT FROZEN [SEC1] Wt Cache Reorder: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled. Self-test execution status: ( 117) The previous self-test completed having the read element of the test failed. Total time to complete Offline data collection:(0) seconds. Offline data collection capabilities:(0x53) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. No Offline surface scan supported. Self-test supported. No Conveyance Self-test supported. Selective Self-test supported. SMART capabilities:(0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability:(0x01) Error logging supported. General Purpose Logging supported. 
Short self-test routine recommended polling time:( 2) minutes. Extended self-test routine recommended polling time:( 85) minutes. SCT capabilities: (0x003d) SCT Status supported. SCT Error Recovery Control supported. SCT Feature Control supported. SCT Data Table supported. SMART Attributes Data Structure revision number: 1 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAGSVALUE WORST THRESH FAIL RAW_VALUE 5 Reallocated_Sector_Ct PO--CK 094 094 010-64 9 Power_On_Hours -O--CK 095 095 000-21194 12 Power_Cycle_Count -O--CK 099 099 000-86 177 Wear_Leveling_Count PO--C- 099 099 000-23 179 Used_Rsvd_Blk_Cnt_Tot PO--C- 094 094 010-64 181 Program_Fail_Cnt_Total -O--CK 100 100 010-0 182 Erase_Fail_Count_Total -O--CK 100 100 010-0 183
Re: smartctl cannot access my storage, need syntax help
On 21/01/2024 03:23, gene heskett wrote: On 1/20/24 10:24, Max Nikulin wrote: On 19/01/2024 06:10, gene heskett wrote: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 190 Airflow_Temperature_Cel 0x0032 071 049 000 Old_age Always - 29 Initial 100 decreased to 49 means that sometimes the drive is hot enough. I've been under the impression that 100C was the absolute temp limit Do not confuse normalized values (100 means shiny new, 0 means really old or damaged) and RAW_VALUE. For some drives smartctl -x may report history of temperature measurements, but I think summer values are already unavailable. and it not been over 36C that I know of according to gkrellm which s set to monitor that stuff in real time. Right now nothing in the system is north of 32C, might get to 36C 71 <-> 29 °C and 49 <-> 36 °C mapping might be possible, but I would expect higher temperature for 49. # 2 Extended offline Completed: read failure 50% 10917 1847474376 # 3 Extended offline Completed: read failure 50% 10586 1847474376 May it happen that disk firmware does not remap failed sectors to allow the user to identify what file is damaged? IDK Max. I know the microware os9 file system well enough to connect the dots, but have little knowledge for how one might do this with ext4. If you are motivated enough then docs either for badblocks or for some data recovery software may give you a recipe. A search engine should help to find it.
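The distinction between the normalized VALUE/WORST/THRESH columns and RAW_VALUE can be made explicit by splitting an attribute row into named fields. The following is a minimal sketch that assumes the ten-column layout printed by `smartctl -A` as quoted in this thread; the class and function names are hypothetical.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int    # normalized, current
    worst: int    # normalized, lowest seen so far
    thresh: int   # normalized failure threshold
    raw: str      # vendor-specific raw value, e.g. degrees Celsius or a count


def parse_attribute_row(row):
    """Parse one row of the ten-column 'smartctl -A' attribute table."""
    fields = row.split(None, 9)  # the raw value may itself contain spaces
    if len(fields) != 10:
        raise ValueError(f"unexpected attribute row: {row!r}")
    return SmartAttribute(
        attr_id=int(fields[0]),
        name=fields[1],
        value=int(fields[3]),
        worst=int(fields[4]),
        thresh=int(fields[5]),
        raw=fields[9],
    )


def headroom(attr):
    """Normalized points left before the attribute reaches its threshold."""
    return attr.value - attr.thresh


if __name__ == "__main__":
    temp = parse_attribute_row(
        "190 Airflow_Temperature_Cel 0x0032 071 049 000 Old_age Always - 29"
    )
    print(temp.name, "normalized:", temp.value, "worst:", temp.worst, "raw:", temp.raw)
```

For the temperature row, the raw field (29) is the reading in degrees Celsius, while 071 and 049 are the normalized current and worst values, which is why the two should not be compared directly.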
Re: smartctl cannot access my storage, need syntax help
On 1/20/24 10:24, Max Nikulin wrote:
> On 19/01/2024 06:10, gene heskett wrote:
> > SMART Attributes Data Structure revision number: 1
> > Vendor Specific SMART Attributes with Thresholds:
> > ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
> > 179 Used_Rsvd_Blk_Cnt_Tot   0x0013   085   085   010    Pre-fail  Always       -       168
> > 183 Runtime_Bad_Block       0x0013   085   085   010    Pre-fail  Always       -       168
>
> 85 is still far enough from 10, however the change is noticeable.
>
> > 190 Airflow_Temperature_Cel 0x0032   071   049   000    Old_age   Always       -       29
>
> Initial 100 decreased to 49 means that sometimes the drive is hot enough.

I've been under the impression that 100C was the absolute temp limit, and it has not been over 36C that I know of according to gkrellm, which is set to monitor that stuff in real time. Right now nothing in the system is north of 32C, might get to 36C at the end of a 9 minute build of something in OpenSCAD.

> On the other hand the raw value of 29 is likely centigrade degrees and it is not really hot for the normalized value of 71.

This is true, all reported temps are in C.

> > SMART Self-test log structure revision number 1
> > Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
> > # 1  Extended offline    Completed: read failure       50%     21128         1847474744
> > # 2  Extended offline    Completed: read failure       50%     10917         1847474376
> > # 3  Extended offline    Completed: read failure       50%     10586         1847474376
>
> May it happen that disk firmware does not remap failed sectors to allow the user to identify what file is damaged?

IDK Max. I know the Microware OS-9 file system well enough to connect the dots, but have little knowledge of how one might do this with ext4.

Thanks Max, take care & stay well.

Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable. - Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
On 19/01/2024 06:10, gene heskett wrote:
> SMART Attributes Data Structure revision number: 1
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
> 179 Used_Rsvd_Blk_Cnt_Tot   0x0013   085   085   010    Pre-fail  Always       -       168
> 183 Runtime_Bad_Block       0x0013   085   085   010    Pre-fail  Always       -       168

85 is still far enough from 10, however the change is noticeable.

> 190 Airflow_Temperature_Cel 0x0032   071   049   000    Old_age   Always       -       29

Initial 100 decreased to 49 means that sometimes the drive is hot enough. On the other hand the raw value of 29 is likely centigrade degrees and it is not really hot for the normalized value of 71.

> SMART Self-test log structure revision number 1
> Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Extended offline    Completed: read failure       50%     21128         1847474744
> # 2  Extended offline    Completed: read failure       50%     10917         1847474376
> # 3  Extended offline    Completed: read failure       50%     10586         1847474376

May it happen that disk firmware does not remap failed sectors to allow the user to identify what file is damaged?
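As a first step toward identifying the damaged file, the LBA_of_first_error can be converted into a filesystem block number. The sketch below uses the usual formula, (LBA - partition start) * sector size / filesystem block size, and rests on several assumptions that the thread does not confirm: the filesystem sits directly on a partition of the failing disk, sectors are 512 bytes, the ext4 block size is 4096 bytes, and the partition starts at sector 2048 (a purely hypothetical value). With md RAID or LVM between the disk and the filesystem, as on the system discussed here, additional offsets apply and this calculation alone is not sufficient.

```python
def lba_to_fs_block(lba, partition_start_sector, sector_bytes=512, fs_block_bytes=4096):
    """Map a disk LBA to a block number inside a filesystem on that partition.

    Only valid when the filesystem sits directly on the partition
    (no md RAID, LVM or encryption layer in between).
    """
    if lba < partition_start_sector:
        raise ValueError("LBA lies before the start of the partition")
    return (lba - partition_start_sector) * sector_bytes // fs_block_bytes


if __name__ == "__main__":
    # LBA taken from the self-test log above; 2048 is a hypothetical start
    # sector, the real one would come from the partition table.
    print(lba_to_fs_block(1847474376, partition_start_sector=2048))
```

On ext4, the resulting block number could then be passed to the `icheck` and `ncheck` commands of debugfs(8) to look up the inode and path name, provided the assumptions above hold.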
Re: smartctl cannot access my storage, need syntax help
On 1/19/24 21:34, gene heskett wrote: On 1/19/24 20:29, Felix Miata wrote: gene heskett composed on 2024-01-19 19:09 (UTC-0500): On 1/19/24 15:56, David Christensen wrote: https://www.cablematters.com/pc-187-156-3-pack-straight-60-gbps-sata-iii-cable.aspx Cheap enough at 18", ordered 4 packs of 3 for service & build stock, thanks David. Among the elements of that page, opened in web browser lacking JS support, was absence of a price, and also were the following "features": Serial ATA/150 and Fast data transfer rate of up to 150 Mbps Those describe SATA revision 1.0 (1.5 Gbit/s), not SATA revision 2.0 (300MB/s, 3.0 Gbit/s), not SATA revision 3.0 (600MB/s, 6.0 Gbit/s). https://en.wikipedia.org/wiki/SATA With JS enabled, the page radically changed to show $8.49 for a 3-pack of 6.0 Gbit/s cables. They had 2 lengths, 24" will if everything isn't good, sign on as sata-II but the 18" I bought claim sata-III. I bought black cables, 18" and 24", straight-straight and straight-90. The older ones are labeled "Serial ATA 6G". The newer ones are labeled "Serial ATA3.2". David
Re: smartctl cannot access my storage, need syntax help
On 1/19/24 20:29, Felix Miata wrote: gene heskett composed on 2024-01-19 19:09 (UTC-0500): On 1/19/24 15:56, David Christensen wrote: No sign of that snipped stuff. https://www.cablematters.com/pc-187-156-3-pack-straight-60-gbps-sata-iii-cable.aspx Cheap enough at 18", ordered 4 packs of 3 for service & build stock, thanks David. Among the elements of that page, opened in web browser lacking JS support, was absence of a price, and also were the following "features": Serial ATA/150 and Fast data transfer rate of up to 150 Mbps Those describe SATA revision 1.0 (1.5 Gbit/s), not SATA revision 2.0 (300MB/s, 3.0 Gbit/s), not SATA revision 3.0 (600MB/s, 6.0 Gbit/s). https://en.wikipedia.org/wiki/SATA With JS enabled, the page radically changed to show $8.49 for a 3-pack of 6.0 Gbit/s cables. They had 2 lengths, 24" will if everything isn't good, sign on as sata-II but the 18" I bought claim sata-III. Cheers, Gene Heskett. -- "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author, 1940) If we desire respect for the law, we must first make the law respectable. - Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
gene heskett composed on 2024-01-19 19:09 (UTC-0500): > On 1/19/24 15:56, David Christensen wrote: > No sign of that snipped stuff. > >> https://www.cablematters.com/pc-187-156-3-pack-straight-60-gbps-sata-iii-cable.aspx > Cheap enough at 18", ordered 4 packs of 3 for service & build stock, > thanks David. Among the elements of that page, opened in web browser lacking JS support, was absence of a price, and also were the following "features": Serial ATA/150 and Fast data transfer rate of up to 150 Mbps Those describe SATA revision 1.0 (1.5 Gbit/s), not SATA revision 2.0 (300MB/s, 3.0 Gbit/s), not SATA revision 3.0 (600MB/s, 6.0 Gbit/s). https://en.wikipedia.org/wiki/SATA With JS enabled, the page radically changed to show $8.49 for a 3-pack of 6.0 Gbit/s cables. -- Evolution as taught in public schools is, like religion, based on faith, not based on science. Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata
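The pairing of line rates and payload rates quoted above (1.5 Gbit/s with 150 MB/s, 3.0 Gbit/s with 300 MB/s, 6.0 Gbit/s with 600 MB/s) follows from the 8b/10b encoding used on the SATA link, where each payload byte occupies 10 bits on the wire. A minimal sketch of that arithmetic, with hypothetical names:

```python
# Line rates in Gbit/s per SATA revision, as listed in the message above.
SATA_LINE_RATE_GBIT = {"1.0": 1.5, "2.0": 3.0, "3.0": 6.0}


def payload_mb_per_s(line_rate_gbit):
    """Payload rate in MB/s: 8b/10b encoding puts 10 line bits on the wire per byte."""
    return line_rate_gbit * 1000 / 10


if __name__ == "__main__":
    for revision, rate in SATA_LINE_RATE_GBIT.items():
        print(f"SATA revision {revision}: {rate} Gbit/s -> {payload_mb_per_s(rate):.0f} MB/s")
```

Read this way, the "Serial ATA/150" wording on the script-less version of the page corresponds to the 1.5 Gbit/s generation rather than to a 6.0 Gbit/s cable.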
Re: smartctl cannot access my storage, need syntax help
On 1/19/24 15:56, David Christensen wrote:

No sign of that snipped stuff.

> https://www.cablematters.com/pc-187-156-3-pack-straight-60-gbps-sata-iii-cable.aspx

Cheap enough at 18", ordered 4 packs of 3 for service & build stock, thanks David.

> I call that the "wiggle" test.

So do I but I've had to explain it. Several times.

Now they'll have to dig me out, got around 16" of white stuff in the last 36 hrs. I believe winter has arrived.

Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable. - Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
On 1/18/24 23:23, gene heskett wrote:
> On 1/19/24 00:55, David Christensen wrote:
> > I am unclear if those errors are inside the SSD or if they are the SATA communications link between the SSD and the motherboard or HBA port and/or main memory (?). Does dmesg(1) show anything?
>
> I'm not sure what I should be looking for, and I don't see anything that is looping to correct an error. Suggested grep targets?

Here is a dmesg(1) excerpt from 2014 -- Debian 7, good SSD, bad SATA cable:

[    2.086360] ata3.00: ATA-9: INTEL SSDSC2CW060A3, 400i, max UDMA/133
[    2.086365] ata3.00: 117231408 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[    2.096265] ata3.00: configured for UDMA/133
[   14.718054] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
[   18.449227] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
[   20.157693] ata3.00: exception Emask 0x10 SAct 0x40 SErr 0xc1 action 0x6 frozen
[   20.157699] ata3.00: irq_stat 0x0800, interface fatal error
[   20.157703] ata3: SError: { RecovData Handshk LinkSeq }
[   20.157709] ata3.00: failed command: WRITE FPDMA QUEUED
[   20.157716] ata3.00: cmd 61/08:b0:a0:e0:61/00:00:00:00:00/40 tag 22 ncq 4096 out
[   20.157721] ata3.00: status: { DRDY }
[   20.157727] ata3: hard resetting link
[   20.473489] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   20.484835] ata3.00: configured for UDMA/133
[   20.484847] ata3: EH complete
[   21.059825] ata3.00: exception Emask 0x10 SAct 0x4000 SErr 0x400100 action 0x6 frozen
[   21.059831] ata3.00: irq_stat 0x0800, interface fatal error
[   21.059835] ata3: SError: { UnrecovData Handshk }
[   21.059840] ata3.00: failed command: WRITE FPDMA QUEUED
[   21.059848] ata3.00: cmd 61/08:70:50:e2:61/00:00:00:00:00/40 tag 14 ncq 4096 out
[   21.059853] ata3.00: status: { DRDY }
[   21.059859] ata3: hard resetting link
[   21.376135] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   21.397234] ata3.00: configured for UDMA/133
[   21.397246] ata3: EH complete
[   22.590805] ata3.00: exception Emask 0x10 SAct 0x600 SErr 0x400100 action 0x6 frozen
[   22.590811] ata3.00: irq_stat 0x0800, interface fatal error
[   22.590815] ata3: SError: { UnrecovData Handshk }
[   22.590819] ata3.00: failed command: WRITE FPDMA QUEUED
[   22.590826] ata3.00: cmd 61/08:48:f0:ee:1d/00:00:00:00:00/40 tag 9 ncq 4096 out
[   22.590831] ata3.00: status: { DRDY }
[   22.590834] ata3.00: failed command: WRITE FPDMA QUEUED
[   22.590840] ata3.00: cmd 61/08:50:70:ef:1d/00:00:00:00:00/40 tag 10 ncq 4096 out
[   22.590844] ata3.00: status: { DRDY }
[   22.590851] ata3: hard resetting link
[   22.909955] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   22.921525] ata3.00: configured for UDMA/133
[   22.937878] ata3: EH complete
[   22.938635] ata3: limiting SATA link speed to 3.0 Gbps
[   22.938638] ata3.00: exception Emask 0x10 SAct 0x40 SErr 0x400100 action 0x6 frozen
[   22.938640] ata3.00: irq_stat 0x0800, interface fatal error
[   22.938642] ata3: SError: { UnrecovData Handshk }
[   22.938645] ata3.00: failed command: WRITE FPDMA QUEUED
[   22.938648] ata3.00: cmd 61/60:b0:20:28:66/00:00:00:00:00/40 tag 22 ncq 49152 out
[   22.938650] ata3.00: status: { DRDY }
[   22.938652] ata3: hard resetting link
[   23.257418] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   23.269251] ata3.00: configured for UDMA/133
[   23.285387] ata3: EH complete

> > In any case, make sure that you are using SATA III 6 Gbps cables with locking connectors for your drives and that all the connections are good.
>
> That's hard to verify once the cables are removed from the packing. All are black, with locking clips. There is a cable maker under every tree in China so I'm not swearing any are up to specs. I've had cable problems in the past, but usually with a magenta colored one that is over 2 years old. If you have a known good src for straight-on cables, please share. You would be doing everyone a favor.

https://www.cablematters.com/pc-187-156-3-pack-straight-60-gbps-sata-iii-cable.aspx
https://www.cablematters.com/pc-188-156-cable-matters-3-pack-90-degree-right-angle-60-gbps-sata-iii-cable-18-inches.aspx

> Test what you have by taking a wooden stick and moving each one a centimeter or so; if the log blows up with sata resets, bingo, bad cable. Replace it asap.

I call that the "wiggle" test.

David
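As a possible answer to the "suggested grep targets" question, the message types that stand out in the excerpt above are the `exception Emask` lines, `interface fatal error`, `hard resetting link`, and `limiting SATA link speed`. The following sketch counts them per ATA port in captured dmesg text. The patterns are derived only from the lines quoted in this message and are not a complete list of libata messages; the names `PATTERNS` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Patterns taken from the dmesg excerpt quoted in this thread.
PATTERNS = {
    "exception": re.compile(r"(ata\d+)(?:\.\d+)?: exception Emask"),
    "interface_fatal": re.compile(r"(ata\d+)(?:\.\d+)?: irq_stat .*interface fatal error"),
    "hard_reset": re.compile(r"(ata\d+): hard resetting link"),
    "speed_limited": re.compile(r"(ata\d+): limiting SATA link speed"),
}

SAMPLE = """\
[   20.157693] ata3.00: exception Emask 0x10 SAct 0x40 SErr 0xc1 action 0x6 frozen
[   20.157699] ata3.00: irq_stat 0x0800, interface fatal error
[   20.157727] ata3: hard resetting link
[   22.938635] ata3: limiting SATA link speed to 3.0 Gbps
[   23.285387] ata3: EH complete
"""


def summarize(log_text):
    """Count link-trouble messages per (port, kind) in dmesg output."""
    counts = Counter()
    for line in log_text.splitlines():
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(match.group(1), kind)] += 1
    return counts


if __name__ == "__main__":
    for (port, kind), n in sorted(summarize(SAMPLE).items()):
        print(f"{port}: {kind} x{n}")
```

To check a live system, the output of dmesg(1) could be saved to a file and its contents passed to `summarize` in place of `SAMPLE`.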
Re: smartctl cannot access my storage, need syntax help
On 1/19/24 00:03, Anssi Saari wrote:
> My only mdraid was on raw partitions but that never had any issues. I think zfs effectively does the same, no partitions.

You can do it either way on ZFS.

David
Re: To partition or not to partition MD arrays (Was Re: smartctl cannot access my storage, need syntax help)
On 19/01/24 at 20:14, Nicolas George wrote:
> Franco Martelli (12024-01-19):
> > > One case against using partitions on mdraid: if your array gets messed up, you get to recreate those partition tables yourself and that's just hilarious if you don't have a backup. Happened to a friend of mine, reason was a UPS brownout.
> > How can I get a backup of mdadm RAID partition?
>
> You do not need a backup of the RAID partitions, that would be terribly inefficient. You need a backup of the partition table.

Yes, I agree of course. I was asking Anssi because it seems strange to me to back up the partitions themselves, which is how I understood his point.

> Which, if you are organized, you already have in $notes_dir/$hostname/install.md as something that looks like this:
>
> ```
> sudo sfdisk /dev/sdX <

The partition table of my HDD is part of my backup.

Cheers,
--
Franco Martelli
Re: To partition or not to partition MD arrays (Was Re: smartctl cannot access my storage, need syntax help)
Franco Martelli (12024-01-19):
> > One case against using partitions on mdraid: if your array gets messed
> > up, you get to recreate those partition tables yourself and that's just
> > hilarious if you don't have a backup. Happened to a friend of mine,
> > reason was a UPS brownout.
> How can I get a backup of mdadm RAID partition?

You do not need a backup of the RAID partitions, that would be terribly inefficient. You need a backup of the partition table.

Which, if you are organized, you already have in $notes_dir/$hostname/install.md as something that looks like this:

```
sudo sfdisk /dev/sdX <
```
Re: To partition or not to partition MD arrays (Was Re: smartctl cannot access my storage, need syntax help)
On 19/01/24 at 09:03, Anssi Saari wrote:
> One case against using partitions on mdraid: if your array gets messed up, you get to recreate those partition tables yourself and that's just hilarious if you don't have a backup. Happened to a friend of mine, reason was a UPS brownout.

How can I get a backup of an mdadm RAID partition? And which tool would back up the whole disks of an array? The only tool that comes to mind is "dd", which isn't a viable solution for me. I think it is useless to back up the raw data stored in a partition or on the whole disk. I back up files and directories stored in the filesystem, not raw data. If an error occurs in the RAID, mdadm takes care to warn me via email... I hope!

> I think he scanned his disks for copies of the superblock but didn't find any and then somehow with a lot of hassle eventually figured out what the partition tables were. So in a catastrophe, partition tables are one more obstacle to cross before you can start actually recovering your data.

I too ran into a catastrophe scenario: I lost /dev/md0, and the cause was using hibernate (suspend to disk) with a logical volume placed inside the RAID. I think the RAID metadata was damaged. I got out of it using Debian-installer. I thought that I had lost everything and was prepared to reinstall, but when Debian-installer asked me to create the new RAID I specified all four partitions, saved, and magically the logical device and all my logical volumes embedded in the old RAID reappeared. Having partitions was no trouble in those circumstances.

> My only mdraid was on raw partitions but that never had any issues. I think zfs effectively does the same, no partitions.

Which raw partitions? Did you perhaps mean without partitions? I have never used zfs; it is full featured, but I prefer to keep things simple: RAID -> LVM -> ext4

Cheers,
--
Franco Martelli
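For the partition-table backup discussed in this sub-thread, sfdisk(8) can write a table as text with `--dump` and re-create it by reading that text on standard input. The sketch below only assembles the two shell command lines and does not run them, because the restore step overwrites the partition table of the target device. The device and file names are placeholders, and the helper names are hypothetical.

```python
import shlex


def dump_command(device, dump_file):
    """Shell command that saves the partition table of `device` as text."""
    return f"sfdisk --dump {shlex.quote(device)} > {shlex.quote(dump_file)}"


def restore_command(device, dump_file):
    """Shell command that rewrites the partition table of `device` from a dump.

    Destructive: review the dump file and the device name before running it.
    """
    return f"sfdisk {shlex.quote(device)} < {shlex.quote(dump_file)}"


if __name__ == "__main__":
    # /dev/sdX and sdX.dump are placeholders, as in the message quoted above.
    print(dump_command("/dev/sdX", "sdX.dump"))
    print(restore_command("/dev/sdX", "sdX.dump"))
```

Both commands need root privileges when actually run, and the dump file is only useful if it is stored somewhere other than the disks it describes.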
Re: smartctl cannot access my storage, need syntax help
On 1/19/24 04:50, Thomas Schmitt wrote: Hi, Anssi Saari It does seem strange to me, even in MS-DOS era I was able to set a terminal scrollback to 5000 lines without issue, when RAM was maybe 4 MB and a DOS terminal program probably had access to way less than that. I have no problems with 130 xterms of 10,000 lines each. So does rsync really generate gigabytes of verbose output? rsync can be extremely verbose when the number of transferred files is very high. Or is xfce-terminal storing the scrollback in a very inefficient way? I would not be astonished to learn that the luxury ornamented terminals of the various desktops waste many extra bytes when memorizing plain text. But the real bug is the fact that the scroll back memory is unlimited and can summon the OOM killer. (I imagine it like the Discworld Death of Rats.) If i were a user of Xfce i would report this as bug to its Debian maintainer. Bug title "xfce-terminal: A landmine on the kids' playground". Have a nice day :) Thomas . Excellent description Thomas. Love it. Cheers, Gene Heskett. -- "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author, 1940) If we desire respect for the law, we must first make the law respectable. - Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
On 1/19/24 03:12, Anssi Saari wrote:
> gene heskett writes:
> > The OOM death of the system was the xfce4 terminal apparently being set for unlimited scrollback and that was eating the memory. Switching to Konsole, which has the ability to limit the scrollback to 200 lines, and it's taken all 32G's as .cache and 1536 1k blocks of swap, and it's working w/o any OOM actions I've detected.
>
> It does seem strange to me, even in MS-DOS era I was able to set a terminal scrollback to 5000 lines without issue, when RAM was maybe 4 MB and a DOS terminal program probably had access to way less than that.
>
> So does rsync really generate gigabytes of verbose output? Or is xfce-terminal storing the scrollback in a very inefficient way?

That I can't answer, other than that -v outputs a full path from / for every file it touches, and if that is being stored in RAM, I can sure see it eating 32G very quickly when it is moving 335G. It only got around 13G moved before OOM struck and killed the system, on each of probably 15 attempts. Knowing that the tech of an SSD and the common micro-SD has a relatively limited actual write speed after it has used up its input cache of fast RAM, I took the v off the -av and then limited it to 10 megs a second. It took around 9 hours and the system acted normally, no OOM problems.

I have edited /etc/fstab and am now running on that copy for /home. The raid is now automounted to /raid10 and says it's valid, despite the 4th drive's log being a mess. My thoughts are to reverse the copy and put it in crontab to keep an up-to-date backup of /home until I can re-invent my wrappers for amanda. /home is by far the biggest glop of data, and none of my printers or cnc machines will use more than 10G each, so I'm inclined to think of the other 8T of drives as an lvm managed 8T, which should give me room enough to keep 30 days worth of amanda's way of doing things.
But I'm hibernating for the nonce, I woke up at 6 with 6" of new snow on the deck, and the weather fabricators are promising another 24 hours of that, might wind up with 3 or 4 feet of it. I've got coffee, the freezers are well stocked. Boring but safe. All the messy logs were at hour 21027 so that was a single actual event, probably caused by OOM. Take care Anssi, stay warm, dry and well where ever you are. Cheers, Gene Heskett. -- "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author, 1940) If we desire respect for the law, we must first make the law respectable. - Louis D. Brandeis
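The "around 9 hours" figure in this message is consistent with the numbers given: 335G moved at a cap of 10 megs a second. A quick check, assuming decimal units (10^9 and 10^6 bytes); the post does not say how the limit was applied, though rsync's `--bwlimit` option is the usual way. The function name is made up for illustration.

```python
def transfer_hours(total_bytes, bytes_per_second):
    """Time in hours needed to move total_bytes at a fixed rate."""
    return total_bytes / bytes_per_second / 3600


if __name__ == "__main__":
    # 335 GB at 10 MB/s, both decimal (an assumption; the post just says
    # "335G" and "10megs a second").
    print(f"{transfer_hours(335e9, 10e6):.1f} hours")
```

With binary units (GiB and MiB) the estimate rises only slightly, to about 9.5 hours, so the conclusion does not depend on that choice.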
Re: smartctl cannot access my storage, need syntax help
Hi, Anssi Saari > It does seem strange to me, even in MS-DOS era I was able to set a > terminal scrollback to 5000 lines without issue, when RAM was maybe 4 MB > and a DOS terminal program probably had access to way less than that. I have no problems with 130 xterms of 10,000 lines each. > So does rsync really generate gigabytes of verbose output? rsync can be extremely verbose when the number of transferred files is very high. > Or is xfce-terminal storing the scrollback in a very inefficient way? I would not be astonished to learn that the luxury ornamented terminals of the various desktops waste many extra bytes when memorizing plain text. But the real bug is the fact that the scroll back memory is unlimited and can summon the OOM killer. (I imagine it like the Discworld Death of Rats.) If i were a user of Xfce i would report this as bug to its Debian maintainer. Bug title "xfce-terminal: A landmine on the kids' playground". Have a nice day :) Thomas
Re: smartctl cannot access my storage, need syntax help
gene heskett writes: > The OOM death of the system was the xfce4 terminal apparently being > set for unlimited scrollback and that was eating the memory. Switching > to Konsole with has the ability to control the scrollback to 200 > lines, and its taken all 32G's as .cache and 1536 1k blocks of swap, > and its working w/o any OOM actions I've detected. It does seem strange to me, even in MS-DOS era I was able to set a terminal scrollback to 5000 lines without issue, when RAM was maybe 4 MB and a DOS terminal program probably had access to way less than that. So does rsync really generate gigabytes of verbose output? Or is xfce-terminal storing the scrollback in a very inefficient way?
Re: smartctl cannot access my storage, need syntax help
Franco Martelli writes: > I don't know if it is a good idea, in fact it exists a special > partition type for RAID array listed in fdisk, I used that for my > RAID: One case against using partitions on mdraid: if your array gets messed up, you get to recreate those partition tables yourself and that's just hilarious if you don't have a backup. Happened to a friend of mine, reason was a UPS brownout. I think he scanned his disks for copies of the superblock but didn't find any and then somehow with a lot of hassle eventually figured out what the partition tables were. So in a catastrophe, partition tables are one more obstacle to cross before you can start actually recovering your data. My only mdraid was on raw partitions but that never had any issues. I think zfs effectively does the same, no partitions.
Re: smartctl cannot access my storage, need syntax help
On 1/19/24 00:55, David Christensen wrote: On 1/18/24 15:10, gene heskett wrote: On 1/18/24 16:08, David Christensen wrote: On 1/18/24 03:47, gene heskett wrote: I have issued a smartctl -tlong on all 4 drives, results in about 3 hours. A SMART long test should find and fix any read errors. Which has now been done on all 4 SSD. but the log is still a mess. 4th one in particular, smartctl -a /dev/sdg attached. 179 Used_Rsvd_Blk_Cnt_Tot 0x0013 085 085 010 Pre-fail Always - 168 183 Runtime_Bad_Block 0x0013 085 085 010 Pre-fail Always - 168 187 Uncorrectable_Error_Cnt 0x0032 099 099 000 Old_age Always - 3275 195 ECC_Error_Rate 0x001a 199 199 000 Old_age Always - 3275 Error 3332 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours) When the command that caused the error occurred, the device was active or idle. After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 40 51 38 e8 ea 67 40 Error: WP at LBA = 0x0067eae8 = 6810344 Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- 61 18 38 e8 ea 67 40 07 15:17:03.046 WRITE FPDMA QUEUED 60 00 30 00 5e a9 40 06 15:17:03.046 READ FPDMA QUEUED 60 28 28 00 f4 87 40 05 15:17:03.046 READ FPDMA QUEUED 60 00 20 00 7c a9 40 04 15:17:03.046 READ FPDMA QUEUED 60 00 18 00 4a a9 40 03 15:17:03.046 READ FPDMA QUEUED Error 3331 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours) Error 3330 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours) Error 3329 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours) Error 3328 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours) I am unclear if those errors are inside the SSD or if they are the SATA communications link between the SSD and the motherbaord or HBA port and/or main memory (?). Does dmesg(1) show anything? 
I'm not sure what I should be looking for, and I don't see anything that is looping to correct an error. Suggested grep targets? In any case, make sure that you are using SATA III 6 Gbps cables with locking connectors for your drives and that all the connections are good. That's hard to verify once the cables are removed from the packing. All are black, with locking clips. There is a cable maker under every tree in China so I'm not swearing any are up to specs. I've had cable problems in the past, but usually with a magenta colored one that is over 2 years old. If you have a known good src for straight-on cables, please share. You would be doing everyone a favor. No hot red need apply. People think it's pretty, but the dye that gives the color eats the copper in the cable. I am the src of the internet legend about that, first observed in the early 1970's when all the cb radio mic cables switched from dull red to this bright red/magenta as the tx wire in multiconductor cables. And that wire literally dissolved the copper in the hot red conductor to a dull rusty powder in 2 years. And it's been causing that same failure in sata cables of that color for a decade now. Test what you have by taking a wooden stick and moving each one a centimeter or so; if the log blows up with sata resets, bingo, bad cable. Replace it asap. When deploying an SSD into a new role, I like to do a "secure erase" followed by a SMART long test. Not fam with that, I usually just reformat. But I'll not do that until I have amanda running again. Secure erase will erase all of the blocks in the drive, including those that are held in reserve. This both verifies that each block can be erased, and provides maximum performance when you put the disk into service and start writing to it. Thanks David, take care & stay well Likewise. :-) David Cheers, Gene Heskett. -- "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940) If we desire respect for the law, we must first make the law respectable. - Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
On 1/18/24 15:10, gene heskett wrote:
> On 1/18/24 16:08, David Christensen wrote:
>> On 1/18/24 03:47, gene heskett wrote:
>>> I have issued a smartctl -tlong on all 4 drives, results in about 3
>>> hours.
>>
>> A SMART long test should find and fix any read errors.
>
> Which has now been done on all 4 SSD. but the log is still a mess. 4th
> one in particular, smartctl -a /dev/sdg attached.

179 Used_Rsvd_Blk_Cnt_Tot   0x0013   085   085   010    Pre-fail  Always       -       168
183 Runtime_Bad_Block       0x0013   085   085   010    Pre-fail  Always       -       168
187 Uncorrectable_Error_Cnt 0x0032   099   099   000    Old_age   Always       -       3275
195 ECC_Error_Rate          0x001a   199   199   000    Old_age   Always       -       3275

Error 3332 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 38 e8 ea 67 40  Error: WP at LBA = 0x0067eae8 = 6810344

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --
  61 18 38 e8 ea 67 40 07      15:17:03.046  WRITE FPDMA QUEUED
  60 00 30 00 5e a9 40 06      15:17:03.046  READ FPDMA QUEUED
  60 28 28 00 f4 87 40 05      15:17:03.046  READ FPDMA QUEUED
  60 00 20 00 7c a9 40 04      15:17:03.046  READ FPDMA QUEUED
  60 00 18 00 4a a9 40 03      15:17:03.046  READ FPDMA QUEUED

Error 3331 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours)
Error 3330 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours)
Error 3329 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours)
Error 3328 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours)

I am unclear if those errors are inside the SSD or if they are in the SATA
communications link between the SSD and the motherboard or HBA port and/or
main memory (?). Does dmesg(1) show anything?

In any case, make sure that you are using SATA III 6 Gbps cables with
locking connectors for your drives and that all the connections are good.

>> When deploying an SSD into a new role, I like to do a "secure erase"
>> followed by a SMART long test.
>
> not fam with that, I usually just reformat. But I'll not do that until I
> have amanda running again.

Secure erase will erase all of the blocks in the drive, including those
that are held in reserve. This both verifies that each block can be
erased, and provides maximum performance when you put the disk into
service and start writing to it.

> Thanks David, take care & stay well

Likewise. :-)

David
Re: smartctl cannot access my storage, need syntax help
On Thu 18 Jan 2024 at 00:57:07 (-0800), David Christensen wrote:
> On 1/17/24 22:44, gene heskett wrote:
> > One thing that bothers me is there is no way the installers parted
> > shows partition names for non-raid disks. To me that is a serious
> > bug. It appears from the help that it can LABEL a partition but
> > can't read that LABEL.
>
> When installing to UEFI/GPT, I am able to label partitions in the
> Debian Installer, the labels are visible in the installer, and the
> labels persist on disk after installation is complete.

Agreed, and that doesn't depend on UEFI; MBR/GPT disks show the same
behaviour. But those are PARTLABELs.

But it may be that Gene meant filesystem LABELs. Gene, to check/display
the LABELs, just place, in turn, the highlight on the line for each
partition, like:

  │ >   #5    31.5 GB    ext4    Viva-B    ▒ │

press Return for it to display:

  │ Partition settings:                          │
  │                                              │
  │ Name:                  Viva-B                │
  │ Use as:                Ext4 journaling file system │
  │                                              │
  │ Format the partition:  yes, format it        │
  │ Mount point:           /                     │
  │ Mount options:         defaults              │
  │ Label:                 viva05   ←←           │
  │ Reserved blocks:       5%                    │
  │                                              │
  │ Done setting up the partition                │

where Name: ⇒ PARTLABEL and Label: ⇒ LABEL. Then select "Done setting
up …", or back out, each time.

Cheers,
David.
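On an installed system the same two names can be read back outside the installer, for example with `lsblk -o NAME,LABEL,PARTLABEL` or `blkid`. The sketch below pulls one field out of a blkid-style line. The helper name `blkid_field` is hypothetical, and the sample line is made up around the Viva-B / viva05 names from the dialog above (the UUID is a placeholder).

```shell
#!/bin/sh
# Hypothetical helper: extract one KEY="value" field from a line in the
# format printed by blkid. The leading whitespace in the pattern keeps
# LABEL from matching inside PARTLABEL.
blkid_field() {
    key=$1
    sed -n "s/.*[[:space:]]$key=\"\([^\"]*\)\".*/\1/p"
}

# Made-up sample line; the UUID is a placeholder, not real data.
line='/dev/sda5: LABEL="viva05" UUID="00000000-0000-0000-0000-000000000000" TYPE="ext4" PARTLABEL="Viva-B"'

printf '%s\n' "$line" | blkid_field LABEL
printf '%s\n' "$line" | blkid_field PARTLABEL
```

LABEL lives in the filesystem, PARTLABEL in the GPT, so reformatting a partition loses the first but not the second.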
Re: smartctl cannot access my storage, need syntax help
On 1/18/24 16:08, David Christensen wrote:
> On 1/18/24 03:47, gene heskett wrote:
>> On 1/18/24 03:57, David Christensen wrote:
>>> The old /home RAID10 still has its metadata on disk. I would install
>>> the "mdadm" package, edit /etc/fstab, copy and rework the old /home
>>> line (new mount point, add option "ro"), create the mount point, and
>>> mount.
>>
>> I believe mdadm is already installed. At least enough to collect and
>> mount this raid10 and use it for /home for the last nearly 2 years.
>
> I made the suggestion to install the "mdadm" package because I thought
> you were going to do a fresh install of Debian.
>
>> Now after all this folderol, all 4 of the SSD's are reporting read
>> errors at very high lba's. All 4 drives are reporting the same poh,
>> 21027 hours, for the occurrence of the error. That sounds like it could
>> be just one crash or dirty power down, in which case it s/b repairable.
>> Do we have a repair utility that will force the drive to reallocate a
>> spare sector and fix those?
>>
>> I have issued a smartctl -tlong on all 4 drives, results in about 3
>> hours.
>
> A SMART long test should find and fix any read errors.

Which has now been done on all 4 SSD. but the log is still a mess. 4th
one in particular, smartctl -a /dev/sdg attached.

> When deploying an SSD into a new role, I like to do a "secure erase"
> followed by a SMART long test.

not fam with that, I usually just reformat. But I'll not do that until I
have amanda running again.

Thanks David, take care & stay well

> David

Cheers, Gene Heskett.

smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-17-rt-amd64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 870 EVO 1TB
Serial Number:    S626NF0R302509W
LU WWN Device Id: 5 002538 f413394b0
Firmware Version: SVT01B6Q
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available, deterministic, zeroed
Device is:        In smartctl database 7.3/5319
ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Jan 18 18:02:48 2024 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 117) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x53) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  85) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   085   085   010    Pre-fail  Always       -       168
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       21139
 12 Power_Cycle_Count
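The "( 117)" next to "Self-test execution status" in the output above can be decoded by hand. A minimal sketch, assuming the number is the ATA self-test status byte printed in decimal, with the result code in the high four bits and the remaining tenths of the test in the low four bits. The helper name and the short result texts are illustrative, not smartctl's own wording.

```shell
#!/bin/sh
# Hypothetical helper: decode the decimal value smartctl prints in
# parentheses after "Self-test execution status:".
# Assumption: high nibble = result code, low nibble = tenths remaining.
decode_selftest_status() {
    code=$1
    result=$(( code >> 4 ))
    remaining=$(( (code & 15) * 10 ))
    case $result in
        0)  text="completed without error or never run" ;;
        1)  text="aborted by host" ;;
        2)  text="interrupted by host reset" ;;
        3)  text="fatal or unknown error" ;;
        4)  text="failed: unknown element" ;;
        5)  text="failed: electrical element" ;;
        6)  text="failed: servo/seek element" ;;
        7)  text="failed: read element" ;;
        8)  text="failed: handling damage" ;;
        15) text="in progress" ;;
        *)  text="reserved" ;;
    esac
    echo "result=$result remaining=${remaining}% ($text)"
}

decode_selftest_status 117
```

For 117 (0x75) this prints `result=7 remaining=50% (failed: read element)`, which agrees with smartctl's text: the long test stopped on a read failure about halfway through, so it did not "find and fix" the errors on this drive.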
Re: smartctl cannot access my storage, need syntax help
On 1/18/24 03:47, gene heskett wrote:
> On 1/18/24 03:57, David Christensen wrote:
>> The old /home RAID10 still has its metadata on disk. I would install
>> the "mdadm" package, edit /etc/fstab, copy and rework the old /home
>> line (new mount point, add option "ro"), create the mount point, and
>> mount.
>
> I believe mdadm is already installed. At least enough to collect and
> mount this raid10 and use it for /home for the last nearly 2 years.

I made the suggestion to install the "mdadm" package because I thought
you were going to do a fresh install of Debian.

> Now after all this folderol, all 4 of the SSD's are reporting read
> errors at very high lba's. All 4 drives are reporting the same poh,
> 21027 hours, for the occurrence of the error. That sounds like it could
> be just one crash or dirty power down, in which case it s/b repairable.
> Do we have a repair utility that will force the drive to reallocate a
> spare sector and fix those?
>
> I have issued a smartctl -tlong on all 4 drives, results in about 3
> hours.

A SMART long test should find and fix any read errors.

When deploying an SSD into a new role, I like to do a "secure erase"
followed by a SMART long test.

David
Re: To partition or not to partition MD arrays (Was Re: smartctl cannot access my storage, need syntax help)
Hi,

On Thu, Jan 18, 2024 at 10:28:30AM -0600, Nicholas Geovanis wrote:
> Sounds like this group has finally achieved a long overdue consensus. How
> many times since LVM was ready for root/boot volumes have I been told that
> using partitions was necessary good practice. Even had that in job
> interviews, where half the team would grin at me saying it and the other
> half scowling at my "poor practice".
>
> Now we know it was just personal preference all along. Like somebody said
> :-)

Look, if you're going to resolve this thread so quickly all it means is
that someone is going to have to mention home.arpa or their time zone
setting again.

We have strict quotas here for the amount of circular repeating "you
don't do things like me therefore you are wrong and here are a selection
of Internet standards to back me up" threads that must be taking place
at once.

Thanks,
Andy

--
https://bitfolk.com/ -- No-nonsense VPS hosting
Re: smartctl cannot access my storage, need syntax help
On 2024-01-17, Thomas Schmitt wrote:
> Hi,
>
> Curt wrote:
>> I discovered a couple of discussions of the phenomenon, the upshot of which
>> were:
>> 1) That's what you get when you purchase cheap SSDs.
>> https://www.reddit.com/r/truenas/comments/s0rrpo/two_sata_ssds_with_identical_serial_numbers/
>> 2) SSDs belonging to the same software RAID show identical serial numbers
>> in software, but these numbers don't match the serial numbers printed on the
>> SSDs themselves.
>> https://www.reddit.com/r/truenas/comments/s0rrpo/two_sata_ssds_with_identical_serial_numbers/
>
> Those URLs are identical. (OMG ! Is it contagious ?)

Human error may very well be:

https://www.reddit.com/r/synology/comments/18fe6ez/how_to_fix_2_drives_with_same_serial_number/

> Number 2 would match my suspicion that some layer in the disk driving
> gets confused and mixes up the serial numbers.
>
>
>> But you said *similar*.
>
> By "colliding serial numbers" i mean indeed "identical serial numbers".
>
> How cheap the disks may ever be, that would be no excuse for not making
> them individually distinguishable.
>
>
>> As Gene's threads have too many movable parts
>> for me to follow, on that point I couldn't say.
>
> This one begins to gain presence in the web. So one can use search engines
> and AI to untangle its sub-threads. I meanwhile participate in two of them:
> serial number collision, rsync caused OOM killer (solved now, but how ?).
>
>
> Have a nice day :)
>
> Thomas
normally start new xterms [was: Re: smartctl cannot access my storage, need syntax help]
On 18/01/2024 04:20, Thomas Schmitt wrote:
> I normally start new xterms by
>   xterm -ls -geometry 80x24 -bg wheat -fg black -sl 1 +sb &

Options may be put into ~/.Xresources

    xterm*vt100.saveLines: 1
    xterm*VT100.background: wheat
    xterm*VT100.foreground: black
    ! etc

Use xrdb to merge changes without restarting X session.

It is possible to have several presets (-name or -class), see
/etc/X11/app-defaults/
Re: To partition or not to partition MD arrays (Was Re: smartctl cannot access my storage, need syntax help)
On Wed, Jan 17, 2024, 9:35 PM gene heskett wrote:
> On 1/17/24 19:54, Steve McIntyre wrote:
> > Andy Smith wrote:
> ...
> >> Then there will just be people going by taste.
> >>
> >> Personally I still put them directly on drives. If I ever get taken
> >> out by one of those crappy motherboards, I reserve the right to get
> >> a different religion.
> >
> > I'm clearly a member of a third group of people,,, :-)
> >
> > Putting partitions on the RAID drives helps *me* identify them.
> >
> you aren't alone Steve.
> Cheers, Gene Heskett.

Sounds like this group has finally achieved a long overdue consensus. How
many times since LVM was ready for root/boot volumes have I been told
that using partitions was necessary good practice. Even had that in job
interviews, where half the team would grin at me saying it and the other
half scowling at my "poor practice".

Now we know it was just personal preference all along. Like somebody
said :-)
Re: To partition or not to partition MD arrays (Was Re: smartctl cannot access my storage, need syntax help)
Hey Andy.

Andy Smith wrote:
>
>On Thu, Jan 18, 2024 at 12:53:43AM +, Steve McIntyre wrote:
>> I'm clearly a member of a third group of people,,, :-)
>
>Oh, I didn't mean to imply that those going by taste were in a
>minority! Taste, or possibly, "just never thought about it" could
>well be the biggest group. I was only talking about my observations
>of those who seem to hold strong opinions on this, usually to the
>point where they will advocate "their way" to others.

ACK!

>> Putting partitions on the RAID drives helps *me* identify them.
>
>So, I don't care what people do and I'm not trying to change your
>mind. Would you mind going into what makes "sda1" more identifiable
>for you than "sda" though?
>
>Or is it that you make use of partition labels for some extra info?

If I'm looking at disks on a system, the first thing I'll look for is
the partition table. If a disk has a partition table with "Linux RAID"
partitions visible, that gives me a strong hint of what I should expect
on the disk. Especially if I'm swapping disks around between systems,
commissioning new systems and re-using disks etc.

--
Steve McIntyre, Cambridge, UK.                                st...@einval.com
Can't keep my eyes from the circling sky,
Tongue-tied & twisted, Just an earth-bound misfit, I...
Re: To partition or not to partition MD arrays (Was Re: smartctl cannot access my storage, need syntax help)
Hello,

On Thu, Jan 18, 2024 at 12:53:43AM +, Steve McIntyre wrote:
> I'm clearly a member of a third group of people,,, :-)

Oh, I didn't mean to imply that those going by taste were in a
minority! Taste, or possibly, "just never thought about it" could
well be the biggest group. I was only talking about my observations
of those who seem to hold strong opinions on this, usually to the
point where they will advocate "their way" to others.

> Putting partitions on the RAID drives helps *me* identify them.

So, I don't care what people do and I'm not trying to change your
mind. Would you mind going into what makes "sda1" more identifiable
for you than "sda" though?

Or is it that you make use of partition labels for some extra info?

Thanks,
Andy

--
https://bitfolk.com/ -- No-nonsense VPS hosting
Re: smartctl cannot access my storage, need syntax help
On 1/18/24 03:57, David Christensen wrote:
> On 1/17/24 22:44, gene heskett wrote:
>> On 1/18/24 00:50, David Christensen wrote:
>> The migration took two passes because udev can't make up its alleged
>> mind, so I was finally forced to use the rescue mode to edit fstab to
>> mount it by UUID and that worked. I've got /home on the copy right now.
>
> Congratulations! :-)
>
>> and I took the 60 G's of swap out too since I've never used more than
>> 20G with any gfx program, so I figure 47G's on /dev/sda is enough.
>
> 1 GB swap works for me. When a memory leak gets out of control, I do
> not have to wait long for the lock up.
>
>> So now none of the raid is mounted, but the 30+ second lag when opening
>> a write path is still there, so I was erroneously blaming the raid. So
>> I've narrowed the problem
>
> Good to know.
>
>> but w/o a good clue what to do next.
>
> Find the needle in the haystack or do a fresh install. I prefer the
> latter, because I can estimate the effort and I am reasonably confident
> of the outcome.
>
>> One thing that bothers me is there is no way the installers parted
>> shows partition names for non-raid disks. To me that is a serious bug.
>> It appears from the help that it can LABEL a partition but can't read
>> that LABEL.
>
> When installing to UEFI/GPT, I am able to label partitions in the
> Debian Installer, the labels are visible in the installer, and the
> labels persist on disk after installation is complete.
>
>> parted when asked to print all does that just fine, but the | doesn't
>> put it to less, so it scrolls off screen the top 60% of a parted's
>> print all output at some fraction of C speed. Not exactly helpful. I
>> have other things to do while I cogitate on what to do next.
>
> The following works as expected on my machine:
>
> 2024-01-18 00:34:41 root@laalaa ~ # parted -l | less
>
>> Many thanks to all that helped.
>
> YW. :-)
>
>>> If you use rsync(1), I suggest using some kind of integrity checking
>>> tool to verify that the source and destination file systems are
>>> identical. I prefer BSD mtree(8):
>>
>> I assume I'd have to remount the raid like to /raid? Whew! That's got
>> more arguments than rsync...
>
> The old /home RAID10 still has its metadata on disk. I would install
> the "mdadm" package, edit /etc/fstab, copy and rework the old /home
> line (new mount point, add option "ro"), create the mount point, and
> mount.

I believe mdadm is already installed. At least enough to collect and
mount this raid10 and use it for /home for the last nearly 2 years.

Now after all this folderol, all 4 of the SSD's are reporting read
errors at very high lba's. All 4 drives are reporting the same poh,
21027 hours, for the occurrence of the error. That sounds like it could
be just one crash or dirty power down, in which case it s/b repairable.
Do we have a repair utility that will force the drive to reallocate a
spare sector and fix those?

I have issued a smartctl -tlong on all 4 drives, results in about 3
hours.

> David

Cheers, Gene Heskett.
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 22:44, gene heskett wrote:
> On 1/18/24 00:50, David Christensen wrote:
> The migration took two passes because udev can't make up its alleged
> mind, so I was finally forced to use the rescue mode to edit fstab to
> mount it by UUID and that worked. I've got /home on the copy right now.

Congratulations! :-)

> and I took the 60 G's of swap out too since I've never used more than
> 20G with any gfx program, so I figure 47G's on /dev/sda is enough.

1 GB swap works for me. When a memory leak gets out of control, I do
not have to wait long for the lock up.

> So now none of the raid is mounted, but the 30+ second lag when opening
> a write path is still there, so I was erroneously blaming the raid. So
> I've narrowed the problem

Good to know.

> but w/o a good clue what to do next.

Find the needle in the haystack or do a fresh install. I prefer the
latter, because I can estimate the effort and I am reasonably confident
of the outcome.

> One thing that bothers me is there is no way the installers parted
> shows partition names for non-raid disks. To me that is a serious bug.
> It appears from the help that it can LABEL a partition but can't read
> that LABEL.

When installing to UEFI/GPT, I am able to label partitions in the
Debian Installer, the labels are visible in the installer, and the
labels persist on disk after installation is complete.

> parted when asked to print all does that just fine, but the | doesn't
> put it to less, so it scrolls off screen the top 60% of a parted's
> print all output at some fraction of C speed. Not exactly helpful. I
> have other things to do while I cogitate on what to do next.

The following works as expected on my machine:

2024-01-18 00:34:41 root@laalaa ~ # parted -l | less

> Many thanks to all that helped.

YW. :-)

>> If you use rsync(1), I suggest using some kind of integrity checking
>> tool to verify that the source and destination file systems are
>> identical. I prefer BSD mtree(8):
>
> I assume I'd have to remount the raid like to /raid? Whew! That's got
> more arguments than rsync...

The old /home RAID10 still has its metadata on disk. I would install
the "mdadm" package, edit /etc/fstab, copy and rework the old /home
line (new mount point, add option "ro"), create the mount point, and
mount.

David
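The fstab rework described above might look roughly like this. It is only a sketch: the mount point /raid comes from Gene's question, while the placeholder UUID, the ext4 filesystem type, the `nofail` option and the helper name `ro_fstab_line` are all assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/fstab line that mounts a filesystem
# read-only. Type ext4 and option nofail are assumptions; copy the real
# values from the old /home line.
ro_fstab_line() {
    uuid=$1
    mountpoint=$2
    printf 'UUID=%s  %s  ext4  ro,nofail  0  0\n' "$uuid" "$mountpoint"
}

# Placeholder UUID, not a real one.
ro_fstab_line 00000000-0000-0000-0000-000000000000 /raid

# After adding the printed line to /etc/fstab (as root):
#   mkdir -p /raid
#   mount /raid
```

Mounting with `ro` keeps the old array untouched while it is compared against the copy.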
Re: smartctl cannot access my storage, need syntax help
Hi,

gene heskett wrote:
> > where did the extra 19.4G's come from? Can filesystem
> > ext4's overhead account for that?

In an earlier mail:
> > > command line: rsync -a --bwlimit=10m --fsync --progress /home/
> > > /mnt/homevol

David Christensen wrote:
> Please RTFM rsync(1) to choose your options. These look
> useful:
>    --archive, -a (-rlptgoD)
>    --delete
>    --hard-links, -H
>    --one-file-system, -x
>    --sparse, -S

I bet on --hard-links and --sparse as means to avoid the extra disk space
consumption.
(--archive is important for other reasons, but it was already in use as
-a with your successful rsync run. --delete will be of importance if the
rsync run gets repeated on the already filled target directory tree.)

man rsync:

  -H, --hard-links
         This tells rsync to look for hard-linked files in the source and
         link together the corresponding files on the destination. Without
         this option, hard-linked files in the source are treated as
         though they were separate files.
         [...]
  -S, --sparse
         Try to handle sparse files efficiently so they take up less
         space on the destination.
         [...]

One can observe a similar inflation effect when copying the files of a
Debian installation ISO to hard disk. In the original disk directory on
the machine which created the ISO there were hardlinked kernels and
firmware packages. In the ISO these link siblings share the same file
content storage. But when mounted, the siblings get treated as separate
files with different inode numbers.
So the 8,135,584 bytes of the hardlink siblings
  /install.amd/gtk/vmlinuz
  /install.amd/vmlinuz
  /install.amd/xen/vmlinuz
get triplicated when these three files get copied out of the ISO.

I am somewhat astonished that --hard-links is not default in rsync, as
it is quite important for backup fidelity. (On the other hand it is some
effort to find all siblings on the disk.)

Sparse files are files with large areas of 0-bytes. Many filesystems
don't store the zeros but rather an instruction to hand out the given
number of 0-bytes when requested by a reader.

If i were you, i'd let rsync make a complete new copy with
--hard-links --sparse, and --delete, but without --bwlimit=
in order to get a higher copy fidelity and also to check whether the
transfer speed really was not to blame for the appearance of the OOM
killer.

Have a nice day :)

Thomas
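Both effects are easy to reproduce in a scratch directory. The sketch below assumes GNU coreutils (`cp`, `stat`, `truncate`) and a filesystem that supports hard links and sparse files; the file names are only borrowed from the ISO example above, and `cp` stands in for rsync with and without -H.

```shell
#!/bin/sh
# Demo: why a copy can take more space than its source.
# Assumes GNU coreutils and a filesystem with hard link / sparse support.
work=$(mktemp -d)
mkdir "$work/src"

# Three names, one inode: the data is stored once.
dd if=/dev/zero of="$work/src/vmlinuz" bs=1024 count=64 2>/dev/null
ln "$work/src/vmlinuz" "$work/src/vmlinuz.gtk"
ln "$work/src/vmlinuz" "$work/src/vmlinuz.xen"

# Copy without preserving links: every name becomes its own file.
cp -R --no-preserve=links "$work/src" "$work/plain"
# Copy preserving links, which is what rsync -H is meant to achieve.
cp -R --preserve=links "$work/src" "$work/linked"

src_links=$(stat -c %h "$work/src/vmlinuz")
plain_links=$(stat -c %h "$work/plain/vmlinuz")
linked_links=$(stat -c %h "$work/linked/vmlinuz")
echo "link counts: source=$src_links plain=$plain_links linked=$linked_links"

# A sparse file: large apparent size, few or no allocated blocks.
truncate -s 10M "$work/sparse.img"
apparent=$(stat -c %s "$work/sparse.img")
allocated=$(( $(stat -c %b "$work/sparse.img") * $(stat -c %B "$work/sparse.img") ))
echo "sparse.img: apparent=$apparent bytes, allocated=$allocated bytes"

rm -rf "$work"
```

A link count of 1 in the plain copy means the three names now occupy three times the space, which is the same kind of growth as the 335G to 354G jump discussed in this thread, if hard links or sparse files are indeed the cause there.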
Re: smartctl cannot access my storage, need syntax help
On Tue, 16 Jan 2024 21:10:28 -0500 gene heskett wrote:
> gene@coyote:~/src/klipper-docs$ lsblk -d -o
> NAME,MAJ:MIN,MODEL,SERIAL,WWN /dev/sd[hijkl]
> NAME MAJ:MIN MODEL         SERIAL         WWN
> sdh    8:112 Gigastone SSD GSTD02TB230102
> sdi    8:128 Gigastone SSD GST02TBG221146
> sdj    8:144 Gigastone SSD GST02TBG221146
> sdk    8:160 Gigastone SSD GSTG02TB230206
> sdl    8:176 Gigastone SSD GSTG02TB230206

Something is seriously wrong here. I worked at Maxtor for a while. They
went out of their way to be sure there were no duplicate serial numbers.

Gene, I suggest you check these SNs with the SN on the packages (if
there is one) and on the label on the drive. Also, take each drive, one
at a time, attach it to another computer with a fresh installation of
Debian, one you haven't mucked with in any way, and only one other drive
already in it, and read the SNs there.

I also went looking for Gigastone's web site. Every page I tried at
gigastone.com led to what I presume was an Error 404 page. I say presume
because most of the text was in non-English, probably Chinese,
characters.

--
Does anybody read signatures any more?

https://charlescurley.com
https://charlescurley.com/blog/
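Spotting such collisions by eye gets harder with more drives. A small sketch that lists every serial number reported for more than one device; the helper name `dup_serials` is made up, and it expects "NAME SERIAL" pairs such as `lsblk -d -n -o NAME,SERIAL` prints.

```shell
#!/bin/sh
# Hypothetical helper: read "NAME SERIAL" pairs on stdin and print each
# serial that belongs to more than one device, with the device names.
# Lines without a serial (fewer than two fields) are ignored.
dup_serials() {
    awk 'NF >= 2 { names[$2] = names[$2] " " $1; count[$2]++ }
         END { for (s in count) if (count[s] > 1) print s ":" names[s] }' | sort
}

# Typical use:
#   lsblk -d -n -o NAME,SERIAL | dup_serials
# Here fed with the values quoted above:
printf '%s\n' \
    'sdh GSTD02TB230102' \
    'sdi GST02TBG221146' \
    'sdj GST02TBG221146' \
    'sdk GSTG02TB230206' \
    'sdl GSTG02TB230206' | dup_serials
```

With the quoted data this reports two colliding pairs, sdi/sdj and sdk/sdl; only sdh has a serial of its own.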
Re: smartctl cannot access my storage, need syntax help
On 1/18/24 00:50, David Christensen wrote:
> On 1/17/24 20:20, gene heskett wrote:
>> On 1/17/24 19:58, David Christensen wrote:
>>> On 1/17/24 15:58, gene heskett wrote:
>>>> Now the question is how did it make this:
>>>> homevol s/b very close to /home in size but:
>>>>
>>>> root@coyote:~# df && free
>>>> Filesystem      1K-blocks      Used  Available Use% Mounted on
>>>> udev             16327704         0   16327704   0% /dev
>>>> tmpfs             3272684      1912    3270772   1% /run
>>>> /dev/sda1       863983352  22348472  797673232   3% /
>>>> tmpfs            16363420      1244   16362176   1% /dev/shm
>>>> tmpfs                5120         8       5112   1% /run/lock
>>>> /dev/sda3        47749868       784   45291076   1% /tmp
>>>> /dev/md0p1     1796382580 335102676 1369954928  20% /home
>>>> tmpfs             3272684      4956    3267728   1% /run/user/1000
>>>> /dev/sdh1      1967892164 354519236 1513336680  19% /mnt/homevol
>>>>            total      used      free    shared  buff/cache  available
>>>> Mem:    32726840   3417576    515520    934540    30072184   29309264
>>>> Swap:  111902712      2048 111900664
>>>> root@coyote:~#
>>>>
>>>> It somehow changed 335G into 354G. Thinking the AppImages dir is full
>>>> of soft links of short names pointing at the long filename and had
>>>> turned the links into duplicates, that was the first thing I checked,
>>>> but it was all good soft-links, so where did the extra 19.4G's come
>>>> from? Can filesystem ext4's overhead account for that?
>>>
>>> I suggest running rsync(1) with --dry-run, --log-file=FILE,
>>> --itemize-changes, and whatever other options are needed to find the
>>> differences.
>>>
>>> Please RTFM rsync(1) to choose your options. These look useful:
>>>
>>>    --archive, -a (-rlptgoD)
>>>    --delete
>>
>> Why --delete?
>
> If you have files on the destination from a previous run of rsync(1)
> and they no longer exist on the source, --delete will get rid of
> extraneous files on the destination.
>
>>>    --hard-links, -H
>>>    --one-file-system, -x
>>>    --sparse, -S
>>
>> or --sparse?
>
> First, you need to understand what "sparse file" means:
>
> https://en.wikipedia.org/wiki/Sparse_file
>
> If you have sparse files on the source -- say, 10 GB virtual machine
> images -- then you want rsync(1) to create sparse files on the
> destination.
>
>> Well, my abundance of curiosity, may have killed the cat, but if I
>> understand how rsync's -a works, re-running the same command will only
>> update for the incoming email and any posts I've made while it was
>> running the first time. So the same command quoted last is now running
>> again. When it has exited, which it has now done in about 15 minutes,
>> I'll edit fstab to remove the 60 gigs of swap on md1, remove the
>> existing mount of md0p1 as /home taking the raid10 completely out of
>> the system. And add the mounting of LABEL=homevolsdh1 as the /home
>> partition and reboot.
>>
>> In the event I have to re-install, the raid will still contain my data
>> and can be recovered. I already have a dvd with the most recent
>> netinstall burnt. All I have to do is convince it to not install orca
>> and brltty. Probably by unplugging _all_ usb stuff except the keyboard
>> and mouse buttons.
>>
>> What would solve many of my problems is a bit of help from someone who
>> is running trinity to tell me how to install it on a system w/o any
>> installed gui, which obviously disables synaptic. That leaves apt,
>> apt-get, and aptitude, unless there is a better way. aptitude is
>> uncontrollable, has fixed me once, has torn the system down to another
>> install 3 times so the odds are not in my favor.
>>
>> So those fstab edits have been done, next is a reboot
>
> You should be able to migrate your /home file system from RAID10 to an
> SSD without needing to reinstall Debian.

The migration took two passes because udev can't make up its alleged
mind, so I was finally forced to use the rescue mode to edit fstab to
mount it by UUID and that worked. I've got /home on the copy right now.
And I took the 60 G's of swap out too since I've never used more than
20G with any gfx program, so I figure 47G's on /dev/sda is enough.

So now none of the raid is mounted, but the 30+ second lag when opening
a write path is still there, so I was erroneously blaming the raid. So
I've narrowed the problem but w/o a good clue what to do next.

One thing that bothers me is there is no way the installers parted
shows partition names for non-raid disks. To me that is a serious bug.
It appears from the help that it can LABEL a partition but can't read
that LABEL.

parted when asked to print all does that just fine, but the | doesn't
put it to less, so it scrolls off screen the top 60% of a parted's
print all output at some fraction of C speed. Not exactly helpful. I
have other things to do while I cogitate on what to do next.

Many thanks to all that helped.

> If you use rsync(1), I suggest using some kind of integrity checking
> tool to verify that the source and destination file systems are
> identical. I prefer BSD mtree(8):

I assume I'd have to remount the raid l
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 20:20, gene heskett wrote:
> On 1/17/24 19:58, David Christensen wrote:
>> On 1/17/24 15:58, gene heskett wrote:
>>> Now the question is how did it make this:
>>> homevol s/b very close to /home in size but:
>>>
>>> root@coyote:~# df && free
>>> Filesystem      1K-blocks      Used  Available Use% Mounted on
>>> udev             16327704         0   16327704   0% /dev
>>> tmpfs             3272684      1912    3270772   1% /run
>>> /dev/sda1       863983352  22348472  797673232   3% /
>>> tmpfs            16363420      1244   16362176   1% /dev/shm
>>> tmpfs                5120         8       5112   1% /run/lock
>>> /dev/sda3        47749868       784   45291076   1% /tmp
>>> /dev/md0p1     1796382580 335102676 1369954928  20% /home
>>> tmpfs             3272684      4956    3267728   1% /run/user/1000
>>> /dev/sdh1      1967892164 354519236 1513336680  19% /mnt/homevol
>>>            total      used      free    shared  buff/cache  available
>>> Mem:    32726840   3417576    515520    934540    30072184   29309264
>>> Swap:  111902712      2048 111900664
>>> root@coyote:~#
>>>
>>> It somehow changed 335G into 354G. Thinking the AppImages dir is full
>>> of soft links of short names pointing at the long filename and had
>>> turned the links into duplicates, that was the first thing I checked,
>>> but it was all good soft-links, so where did the extra 19.4G's come
>>> from? Can filesystem ext4's overhead account for that?
>>
>> I suggest running rsync(1) with --dry-run, --log-file=FILE,
>> --itemize-changes, and whatever other options are needed to find the
>> differences.
>>
>> Please RTFM rsync(1) to choose your options. These look useful:
>>
>>    --archive, -a (-rlptgoD)
>>    --delete
>
> Why --delete?

If you have files on the destination from a previous run of rsync(1)
and they no longer exist on the source, --delete will get rid of
extraneous files on the destination.

>>    --hard-links, -H
>>    --one-file-system, -x
>>    --sparse, -S
>
> or --sparse?

First, you need to understand what "sparse file" means:

https://en.wikipedia.org/wiki/Sparse_file

If you have sparse files on the source -- say, 10 GB virtual machine
images -- then you want rsync(1) to create sparse files on the
destination.
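Put together, a preview run with the options listed above might be sketched as follows. The source and destination are the ones from this thread; the log file path, the wrapper name `preview_sync` and the `RSYNC` variable are assumptions for illustration. By default the wrapper only prints the command instead of running it.

```shell
#!/bin/sh
# Sketch: show what a second rsync pass would change, without copying.
# Hypothetical wrapper; RSYNC defaults to "echo rsync" so the command is
# only printed. Set RSYNC=rsync to really run the dry run.
preview_sync() {
    src=$1
    dst=$2
    log=$3
    ${RSYNC:-echo rsync} --archive --hard-links --sparse --one-file-system \
        --delete --dry-run --itemize-changes --log-file="$log" "$src" "$dst"
}

preview_sync /home/ /mnt/homevol/ /tmp/rsync-preview.log
```

Because of --dry-run nothing is transferred or deleted; the itemized list in the log shows which files rsync considers different.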
Well, my abundance of curiosity, may have killed the cat, but if I understand how rsync's -a works, re-running the same command will only update for the incoming email and any posts I've made while it was running the first time. So the same command quoted last is now running again. when it has exited, which it has now done in about 15 minutes I'll edit fstab to remove the 60 gigs of swap on md1, remove the existing mount of md0p1 as /home taking the raid10 completely out of the system. And add the mounting of LABEL=homevolsdh1 as the /home partition and reboot. In the event I have to re-install, the raid will still contain my data and can be recovered. I already have a dvd with the most recent netinstall burnt. All I have to do is convince it to not install orca and brltty. Probably by unplugging _all_ usb stuff except the keyboard and mouse buttons. What would solve many of my problems is a bit of help from someone who it running trinity to tell me how to install it on a system w/o any installed gui which obviously disables synaptic. That leaves apt, apt-get, and aptitude, unless there is a better way. aptitude is uncontrollable, has fixed me once, has torn the system down to another install 3 times so the odds are not in my favor. So those fstab edits have been done, next is a reboot You should be able to migrate your /home file system from RAID10 to an SSD without needing to reinstall Debian. Copying a file system that is mounted read-write is problematic. It is best to remount it read-only, and then copy. This is hard to do when you are logged in and using the file system you want to copy. Options include rebooting into single-user root console or using live media. To make an exact copy of the source, consider using a tool designed for this task -- such as cpio(1), tar(1), or a backup/restore system such as amanda(8). If you use rsync(1), I suggest using some kind of integrity checking tool to verify that the source and destination file systems are identical. 
I prefer BSD mtree(8): https://manpages.debian.org/bullseye/mtree-netbsd/mtree.8.en.html (Be careful not to confuse the above with mtree(5) via libarchive.) David
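For readers without mtree(8) installed, the same kind of check can be approximated with find(1) and sha256sum(1). This is not mtree and not necessarily what David runs; it is a rough sketch that compares only the contents of regular files, ignoring owners, modes, timestamps, and symlinks. The function names tree_manifest and trees_identical are hypothetical.

```shell
# tree_manifest DIR: print "checksum  ./relative/path" for every regular
# file under DIR (staying on one file system), sorted by path.
tree_manifest() {
    ( cd "$1" && find . -xdev -type f -exec sha256sum {} + | sort -k 2 )
}

# trees_identical SRC DST: succeed only if both manifests are the same.
# Returns 2 if either directory cannot be entered.
trees_identical() {
    a=$(tree_manifest "$1") || return 2
    b=$(tree_manifest "$2") || return 2
    [ "$a" = "$b" ]
}

# Example use, with the mount points from this thread:
#   trees_identical /home /mnt/homevol && echo "file contents match"
```

Because it reads every file on both sides, expect it to take a while on a 335G tree. As noted above, the comparison is only meaningful if the source is not being written to at the time.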
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 19:58, David Christensen wrote: On 1/17/24 15:58, gene heskett wrote: Now the question is how did it make this: homevol s/b very close to /home in size but: root@coyote:~# df && free Filesystem 1K-blocks Used Available Use% Mounted on udev 16327704 0 16327704 0% /dev tmpfs 3272684 1912 3270772 1% /run /dev/sda1 863983352 22348472 797673232 3% / tmpfs 16363420 1244 16362176 1% /dev/shm tmpfs 5120 8 5112 1% /run/lock /dev/sda3 47749868 784 45291076 1% /tmp /dev/md0p1 1796382580 335102676 1369954928 20% /home tmpfs 3272684 4956 3267728 1% /run/user/1000 /dev/sdh1 1967892164 354519236 1513336680 19% /mnt/homevol total used free shared buff/cache available Mem: 32726840 3417576 515520 934540 30072184 29309264 Swap: 111902712 2048 111900664 root@coyote:~# It somehow changed 335G into 354G. Thinking the AppImages dir is full of soft links of short names pointing at the long filename and had turned the links into duplicates, that was the first thing I checked, but it was all good soft-links, so where did the extra 19.4G's come from? Can filesystem ext4's overhead account for that? I suggest running rsync(1) with --dry-run, --log-file=FILE, --itemize_changes, and whatever other options are needed to find the differences. Please RTFM rsync(1) to choose your options. These look useful: --archive, -a (-rlptgoD) --delete Why --delete? --hard-links, -H --one-file-system, -x --sparse, -S or --sparse? Well, my abundance of curiosity, may have killed the cat, but if I understand how rsync's -a works, re-running the same command will only update for the incoming email and any posts I've made while it was running the first time. So the same command quoted last is now running again. when it has exited, which it has now done in about 15 minutes I'll edit fstab to remove the 60 gigs of swap on md1, remove the existing mount of md0p1 as /home taking the raid10 completely out of the system. And add the mounting of LABEL=homevolsdh1 as the /home partition and reboot. 
In the event I have to re-install, the raid will still contain my data and can be recovered. I already have a dvd with the most recent netinstall burnt. All I have to do is convince it to not install orca and brltty. Probably by unplugging _all_ usb stuff except the keyboard and mouse buttons. What would solve many of my problems is a bit of help from someone who it running trinity to tell me how to install it on a system w/o any installed gui which obviously disables synaptic. That leaves apt, apt-get, and aptitude, unless there is a better way. aptitude is uncontrollable, has fixed me once, has torn the system down to another install 3 times so the odds are not in my favor. So those fstab edits have been done, next is a reboot David . Cheers, Gene Heskett. -- "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author, 1940) If we desire respect for the law, we must first make the law respectable. - Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
On Wed 17 Jan 2024 at 15:34:09 (-0500), gene heskett wrote: > On 1/17/24 12:27, Thomas Schmitt wrote: > > David Christensen wrote: > > > I suspect the conflicting serial numbers are causing problems in the > > > kernel, > > > as indicated by the /dev/disk/by-id/* problems. > > > > That's not in the kernel but in udev/systemd's process of creating the > > symbolic links in /dev/disk/by-id/. > > It gets /dev/sd[h-l] and /dev/sd[h-l]1 as kernel generated device files. > > But sd[ij] and also sd[hl] show pair-wise the same serial numbers. > > In case of sd[ij] the outcome is mixed: links to sdi and sdj1 survive. > > In case of sd[hl] we see a less strange outcome: sdh and sdh1, while > > sdl and sdl1 are missing. > > > missing because the original command line did not look at sdl. > I added the l and it showed up. No magic. What do you mean, it was "missing"? The original command, which I wrote for you, contained a wildcard, so it doesn't miss anything that's there: root@coyote:~# for j in /dev/disk/by-id/* ; do printf '%s\t%s\n' "$(realpath "$j")" "$j" ; done and there was no sdl in the output from that command. In fact, there was no "l" in your post between the "l" in "realpath", above, and the "l" in "like", below: root@coyote:~# but like I wrote, 2 pairs with identical "serial numbers", so the https://lists.debian.org/debian-user/2024/01/msg00658.html shows this, that no sdl was seen under by-id/. > > The open question (at least to me) is whether it's the disks or the > > controllers or the drivers which cause the duplication. > Neither, a typu in the original command. Cheers, David.
Re: To partition or not to partition MD arrays (Was Re: smartctl cannot access my storage, need syntax help)
On 1/17/24 19:54, Steve McIntyre wrote: Andy Smith wrote: The newer set of people recommending partitions are mostly doing so because there's been a few incidents of "helpful" PC motherboards detecting on boot what they think is a corrupt GPT, and replacing it with a blank one, damaging the RAID. This is a real thing that has happened to more than one person; it even got linked on Hacker News I believe. Then there will just be people going by taste. Personally I still put them directly on drives. If I ever get taken out by one of those crappy motherboards, I reserve the right to get a different religion. I'm clearly a member of a third group of people,,, :-) Putting partitions on the RAID drives helps *me* identify them. you aren't alone Steve. Cheers, Gene Heskett. -- "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author, 1940) If we desire respect for the law, we must first make the law respectable. - Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 15:58, gene heskett wrote:

Now the question is how did it make this: homevol s/b very close to /home in size but:

root@coyote:~# df && free
Filesystem      1K-blocks      Used  Available Use% Mounted on
udev             16327704         0   16327704   0% /dev
tmpfs             3272684      1912    3270772   1% /run
/dev/sda1       863983352  22348472  797673232   3% /
tmpfs            16363420      1244   16362176   1% /dev/shm
tmpfs                5120         8       5112   1% /run/lock
/dev/sda3        47749868       784   45291076   1% /tmp
/dev/md0p1     1796382580 335102676 1369954928  20% /home
tmpfs             3272684      4956    3267728   1% /run/user/1000
/dev/sdh1      1967892164 354519236 1513336680  19% /mnt/homevol
               total        used        free      shared  buff/cache   available
Mem:        32726840     3417576      515520      934540    30072184    29309264
Swap:      111902712        2048   111900664
root@coyote:~#

It somehow changed 335G into 354G. Thinking the AppImages dir is full of soft links of short names pointing at the long filename and had turned the links into duplicates, that was the first thing I checked, but it was all good soft-links, so where did the extra 19.4G's come from? Can filesystem ext4's overhead account for that?

I suggest running rsync(1) with --dry-run, --log-file=FILE, --itemize-changes, and whatever other options are needed to find the differences.

Please RTFM rsync(1) to choose your options. These look useful:

--archive, -a (-rlptgoD)
--delete
--hard-links, -H
--one-file-system, -x
--sparse, -S

David
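Put together, the suggested options might look like the sketch below. Note that the rsync(1) option is spelled --itemize-changes, with a hyphen. The function name compare_trees and the log file path are assumptions for illustration; /home and /mnt/homevol are the mount points from the df output. Because of --dry-run, nothing is copied or deleted.

```shell
# compare_trees SRC DST LOG: list what rsync would change to make DST
# match SRC, without changing anything (--dry-run).
# The trailing slashes mean "the contents of SRC", not SRC itself.
compare_trees() {
    rsync --dry-run --archive --delete --hard-links --one-file-system \
          --sparse --itemize-changes --log-file="$3" "$1"/ "$2"/
}

# Example use (the log path is hypothetical):
#   compare_trees /home /mnt/homevol /root/rsync-compare.log
```

Each line of output starts with a change summary, followed by the path it applies to. Entries that would be removed by --delete are listed as "*deleting", which is where files left over from an earlier run would show up.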
Re: To partition or not to partition MD arrays (Was Re: smartctl cannot access my storage, need syntax help)
Andy Smith wrote:
>
>The newer set of people recommending partitions are mostly doing so
>because there's been a few incidents of "helpful" PC motherboards
>detecting on boot what they think is a corrupt GPT, and replacing it
>with a blank one, damaging the RAID. This is a real thing that has
>happened to more than one person; it even got linked on Hacker News
>I believe.
>
>Then there will just be people going by taste.
>
>Personally I still put them directly on drives. If I ever get taken
>out by one of those crappy motherboards, I reserve the right to get
>a different religion.

I'm clearly a member of a third group of people,,, :-)

Putting partitions on the RAID drives helps *me* identify them.

--
Steve McIntyre, Cambridge, UK. st...@einval.com
Can't keep my eyes from the circling sky,
Tongue-tied & twisted, Just an earth-bound misfit, I...
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 16:45, David Christensen wrote: On 1/17/24 12:30, gene heskett wrote: By LABELing the partitions uniquely, that problem so far as I can see, is solved. Okay. So, are you confident that your motherboard ports, HBA ports, and SSD's are all working correctly now? The OOM death of the system was the xfce4 terminal apparently being set for unlimited scrollback and that was eating the memory. Switching to Konsole with has the ability to control the scrollback to 200 lines, and its taken all 32G's as .cache and 1536 1k blocks of swap, and its working w/o any OOM actions I've detected. Okay. Xfce -> Terminal Emulator -> right click on screen -> Preferences -> General -> Scrolling: Scrollback 200 Unlimited scrollback uncheck Using tee(1) would allow you to both monitor progress and save standard output and/or standard error (via shell redirection). A related issue is that lots of standard output can slow a program. Minimizing a terminal can help. Redirecting standard output to a file or to /dev/null can help, especially when done on the remote host while using ssh(1). The best solution is to tell rsync(1) not to generate messages on standard output -- do not use --verbose, do not use --info, do not use --progress, etc.; use --quiet, etc.. All good hints after it is done. 
Now the question is how did it make this: homevol s/b very close to /home in size but:

root@coyote:~# df && free
Filesystem      1K-blocks      Used  Available Use% Mounted on
udev             16327704         0   16327704   0% /dev
tmpfs             3272684      1912    3270772   1% /run
/dev/sda1       863983352  22348472  797673232   3% /
tmpfs            16363420      1244   16362176   1% /dev/shm
tmpfs                5120         8       5112   1% /run/lock
/dev/sda3        47749868       784   45291076   1% /tmp
/dev/md0p1     1796382580 335102676 1369954928  20% /home
tmpfs             3272684      4956    3267728   1% /run/user/1000
/dev/sdh1      1967892164 354519236 1513336680  19% /mnt/homevol
               total        used        free      shared  buff/cache   available
Mem:        32726840     3417576      515520      934540    30072184    29309264
Swap:      111902712        2048   111900664
root@coyote:~#

It somehow changed 335G into 354G. Thinking the AppImages dir is full of soft links of short names pointing at the long filename and had turned the links into duplicates, that was the first thing I checked, but it was all good soft-links, so where did the extra 19.4G's come from? Can filesystem ext4's overhead account for that?

David

Thanks David.

Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
- Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 12:30, gene heskett wrote:

By LABELing the partitions uniquely, that problem so far as I can see, is solved.

Okay. So, are you confident that your motherboard ports, HBA ports, and SSD's are all working correctly now?

The OOM death of the system was the xfce4 terminal apparently being set for unlimited scrollback and that was eating the memory. Switching to Konsole, which has the ability to control the scrollback to 200 lines, and it's taken all 32G's as .cache and 1536 1k blocks of swap, and it's working w/o any OOM actions I've detected.

Okay.

Xfce -> Terminal Emulator -> right click on screen -> Preferences -> General -> Scrolling:
Scrollback: 200
Unlimited scrollback: uncheck

Using tee(1) would allow you to both monitor progress and save standard output and/or standard error (via shell redirection).

A related issue is that lots of standard output can slow a program. Minimizing a terminal can help. Redirecting standard output to a file or to /dev/null can help, especially when done on the remote host while using ssh(1). The best solution is to tell rsync(1) not to generate messages on standard output: do not use --verbose, do not use --info, do not use --progress, etc.; use --quiet, etc.

David
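The tee(1) suggestion can be sketched as a small wrapper. The name run_logged and the log path in the example are made up for illustration. The wrapper merges standard error into standard output, so both end up on the terminal and in the file.

```shell
# run_logged LOG CMD [ARGS...]: run CMD, show its stdout and stderr on the
# terminal, and save a copy of both in LOG.
# Note: the exit status is that of tee, not of CMD.
run_logged() {
    log=$1
    shift
    "$@" 2>&1 | tee "$log"
}

# Example use (log path is hypothetical):
#   run_logged /root/rsync-home.log rsync -a /home/ /mnt/homevol/
```

With the log on disk there is less reason to keep a long scrollback in the terminal, and the window can be minimized while the copy runs.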
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 16:16, Thomas Schmitt wrote: Hi, i wrote: What did finally help ? Just the shorter terminal scroll back memory ? gene heskett wrote: That, and possibly the --bwlimit=10m, giving the SSD time to keep their stuff in one sock. Then i place my bet on the terminal alone. Linux is able to handle disk-to-disk copies that are larger than the available memory. This is a standard use case. How large was it set when your runs caused the OOM killer to act ? different terminal, xfce4's is apparently unlimited but can't find it in the config prefs. I normally start new xterms by xterm -ls -geometry 80x24 -bg wheat -fg black -sl 1 +sb & The -sl option gives the number of lines to be memorized for scrollback. Black-on-wheat is a calmative color combination which does not overwork the eyes. Thank you, I did not know that. Have a nice day :) Thomas . Cheers, Gene Heskett. -- "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author, 1940) If we desire respect for the law, we must first make the law respectable. - Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
Hi, i wrote: > > What did finally help ? Just the shorter terminal scroll back memory ? gene heskett wrote: > That, and possibly the --bwlimit=10m, giving the SSD time to keep their > stuff in one sock. Then i place my bet on the terminal alone. Linux is able to handle disk-to-disk copies that are larger than the available memory. This is a standard use case. > > How large was it set when your runs caused the OOM killer to act ? > different terminal, xfce4's is apparently unlimited but can't find it in the > config prefs. I normally start new xterms by xterm -ls -geometry 80x24 -bg wheat -fg black -sl 1 +sb & The -sl option gives the number of lines to be memorized for scrollback. Black-on-wheat is a calmative color combination which does not overwork the eyes. Have a nice day :) Thomas
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 09:31, Thomas Schmitt wrote: Hi, David Christensen wrote: I suspect the conflicting serial numbers are causing problems in the kernel, as indicated by the /dev/disk/by-id/* problems. That's not in the kernel but in udev/systemd's process of creating the symbolic links in /dev/disk/by-id/. It gets /dev/sd[h-l] and /dev/sd[h-l]1 as kernel generated device files. But sd[ij] and also sd[hl] show pair-wise the same serial numbers. In case of sd[ij] the outcome is mixed: links to sdi and sdj1 survive. In case of sd[hl] we see a less strange outcome: sdh and sdh1, while sdl and sdl1 are missing. The open question (at least to me) is whether it's the disks or the controllers or the drivers which cause the duplication. Thank you for the explanation. I would still remove them. David
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 15:13, gene heskett wrote: On 1/17/24 11:30, Thomas Schmitt wrote: Hi, after i began enumerating suspects, gene heskett wrote: terminals scroll back memory, I purposely set this particular terminals scrollback to 200 lines with that in mind. How large was it set when your runs caused the OOM killer to act ? different terminal, xfce4's is apparently unlimited but can't find it in the config prefs. I have a good number of xterms with 10,000 lines each. No tabs, no KDE, but 8 fvwm "desktops" (virtual screens) full of terminal windows. 12 workspaces with 1 to 8 tabs open. 32G of main memory. [Request to test the disks one-by-one on some other computer, whether they bear the same serial number at all controllers in all machines.] Not as easily tried, the other 4 are in twin mounts in another portion of the drive cages in this 30" tall tiger direct cage and not too readily accessible w/o tipping the mobo out on its hinged mount. One should raise protest at Gigastone if the disks really have the same serial numbers. But before doing so, one would have to make sure that it is not some weird effect of them all being plugged into that machine at the same time. Should not be a problem if labeled uniquely. And that's easily affected by gparted. One of you made the remark that seems to be the secret password. What did finally help ? Just the shorter terminal scroll back memory ? That, and possibly the --bwlimit=10m, giving the SSD time to keep their stuff in one sock. It would explain why a verbose rsync could summon the OOM killer always around the same stage of progress. But what waste of memory would have to happen with each of the rsync messages ? Everything you see flying by when the -v is in the opts, and some of the pathnames are 250-300 bytes long. (You mentioned LABEL as a possibility. But not as actually used.) Yes I have, repeatedly. Its still, slowly at 10 megs a second, working. I see in your previous mail rsync option --bwlimit=10m . 
But in the same mail there is an older quote from you that --bwlimit=3m only prolonged the time until the OOM killer appeared. So i wonder whether it would work at a more contemporary speed. I can't change it for testing? Boggles my mind. A probably informative test. But as yet not tested. Self-incrimination: The rest of this mail is off topic. they gave all 7nth graders the Iowa test in 1947, similar to the S/B IQ test but not copyrighted, there fore a lot cheaper, and I came out of that with an equivalent of 147. I was tested in the 1960s but they did not tell the results to kids or parents. We only got recommendations at which of our three types of school we should continue at the age of 10 or 11 years. That I believe was the intention but one of the teachers was a blabbermouth. (So it was not to avoid discrimination of the dumb but rather to avoid that pupils feel more intelligent than their teachers.) That avoidance was untenable, in the 1st semester of my freshman year I got thrown out of the senior physics class for correcting an erroneous statement by the teacher that was patently at odds with Newton's 3rd law of motion. For every action, there is an equal but opposite reaction. Pretty basic stuff. But correcting the teacher in front of the other students was absolutely not to be tolerated. But I felt correcting him AND setting it straight was more important to the rest of the nominally 20 students present than any embarrassment it may have caused him. Same with the papered EE's who can't understand that E=MV2 does not have a speed floor, below which its doesn't work when the electron beam in a klystron amplifier is only moving at a potential of 20,000 volts. 
The problem not understood is that the amplification is obtained not from a current variation, but a velocity variation induced by a 1 watt signal speeding up or slowing down the passing beam as it traverses the first cavity of 4, the next two to control the bandwidth, the last one picks 30 kilowatts back off the beam by the capacitative coupling effects as the beam goes on thru into a copper funnel cooled by 70 gallons of very pure water to absorb the end of that beam which takes around 125 kilowatts to generate. I forgot to mention that 70 gallons figure is a per minute value supplied by a 15 hp ingersol-rand pump. A semi sealed system that has a 4' wide x8' long x1.5' thick radiator supplied with external cooling air by a another 20 horse motor. Rigged by vent louvers to control the air flow to maintain the water above freezing. That 20 horse had the power to blow that whole louver out into the field behind the building when the modutrol motor that controlled that hot air exit louver failed to open it at signon time one morning. Panic call from the remote control site as it was only about 20F outsid
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 12:27, Thomas Schmitt wrote:
> Hi,
>
> David Christensen wrote:
> > I suspect the conflicting serial numbers are causing problems in the kernel,
> > as indicated by the /dev/disk/by-id/* problems.
>
> That's not in the kernel but in udev/systemd's process of creating the
> symbolic links in /dev/disk/by-id/.
> It gets /dev/sd[h-l] and /dev/sd[h-l]1 as kernel generated device files.
> But sd[ij] and also sd[hl] show pair-wise the same serial numbers.
> In case of sd[ij] the outcome is mixed: links to sdi and sdj1 survive.
> In case of sd[hl] we see a less strange outcome: sdh and sdh1, while
> sdl and sdl1 are missing.

missing because the original command line did not look at sdl. I added the l and it showed up. No magic.

> The open question (at least to me) is whether it's the disks or the
> controllers or the drivers which cause the duplication.

Neither, a typo in the original command.

> Have a nice day :)
>
> Thomas

Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
- Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 12:16, Thomas Schmitt wrote: Hi, Curt wrote: I discovered a couple of discussions of the phenomenon, the upshot of which were: 1) That's what you get when you purchase cheap SSDs. https://www.reddit.com/r/truenas/comments/s0rrpo/two_sata_ssds_with_identical_serial_numbers/ 2) SSDs belonging to the same software RAID show identical serial numbers in software, but these numbers don't match the serial numbers printed on the SSDs themselves. https://www.reddit.com/r/truenas/comments/s0rrpo/two_sata_ssds_with_identical_serial_numbers/ Those URLs are identical. (OMG ! Is it contageous ?) Number 2 would match my suspicion that some layer in the disk driving gets confused and mixes up the serial numbers. But you said *similar*. By "colliding serial numbers" i mean indeed "identical serial numbers". How cheap the disks may ever be, that would be no excuse for not making them individually distinguishable. As Gene's threads have too many movable parts for me to follow, on that point I couldn't say. This one begins to gain presence in the web. So one can use search engines and AI to untangle its sub-threads. I meanwhile participate in two of them: serial number collision, rsync caused OOM killer (solved now, but how ?). By LABELing the partitions uniquely, that problem so far as I can see, is solved. The OOM death of the system was the xfce4 terminal apparently being set for unlimited scrollback and that was eating the memory. Switching to Konsole with has the ability to control the scrollback to 200 lines, and its taken all 32G's as .cache and 1536 1k blocks of swap, and its working w/o any OOM actions I've detected. Have a nice day :) Thomas Cheers, Gene Heskett. -- "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author, 1940) If we desire respect for the law, we must first make the law respectable. - Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 11:38, Curt wrote: On 2024-01-17, Thomas Schmitt wrote: This is just weird. I still have difficulties to believe that any disk manufacturer would hand out disks with colliding serial numbers. I googled for this phenomenon, but except two mails of Gene nothing similar popped up. I discovered a couple of discussions of the phenomenon, the upshot of which were: 1) That's what you get when you purchase cheap SSDs. https://www.reddit.com/r/truenas/comments/s0rrpo/two_sata_ssds_with_identical_serial_numbers/ 2) SSDs belonging to the same software RAID show identical serial numbers in software, but these numbers don't match the serial numbers printed on the SSDs themselves. But the drives in question are not yet and never have been in a raid just plugged in awaiting my putting them to work. https://www.reddit.com/r/truenas/comments/s0rrpo/two_sata_ssds_with_identical_serial_numbers/ But you said *similar*. As Gene's threads have too many movable parts for me to follow, on that point I couldn't say. . Cheers, Gene Heskett. -- "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author, 1940) If we desire respect for the law, we must first make the law respectable. - Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 11:30, Thomas Schmitt wrote: Hi, after i began enumerating suspects, gene heskett wrote: terminals scroll back memory, I purposely set this particular terminals scrollback to 200 lines with that in mind. How large was it set when your runs caused the OOM killer to act ? different terminal, xfce4's is apparently unlimited but can't find it in the config prefs. I have a good number of xterms with 10,000 lines each. No tabs, no KDE, but 8 fvwm "desktops" (virtual screens) full of terminal windows. 12 workspaces with 1 to 8 tabs open. 32G of main memory. [Request to test the disks one-by-one on some other computer, whether they bear the same serial number at all controllers in all machines.] Not as easily tried, the other 4 are in twin mounts in another portion of the drive cages in this 30" tall tiger direct cage and not too readily accessible w/o tipping the mobo out on its hinged mount. One should raise protest at Gigastone if the disks really have the same serial numbers. But before doing so, one would have to make sure that it is not some weird effect of them all being plugged into that machine at the same time. Should not be a problem if labeled uniquely. And that's easily affected by gparted. One of you made the remark that seems to be the secret password. What did finally help ? Just the shorter terminal scroll back memory ? That, and possibly the --bwlimit=10m, giving the SSD time to keep their stuff in one sock. It would explain why a verbose rsync could summon the OOM killer always around the same stage of progress. But what waste of memory would have to happen with each of the rsync messages ? (You mentioned LABEL as a possibility. But not as actually used.) Its still, slowly at 10 megs a second, working. I see in your previous mail rsync option --bwlimit=10m . But in the same mail there is an older quote from you that --bwlimit=3m only prolonged the time until the OOM killer appeared. 
So i wonder whether it would work at a more contemporary speed. A probably informative test. But as yet not tested. Self-incrimination: The rest of this mail is off topic. they gave all 7nth graders the Iowa test in 1947, similar to the S/B IQ test but not copyrighted, there fore a lot cheaper, and I came out of that with an equivalent of 147. I was tested in the 1960s but they did not tell the results to kids or parents. We only got recommendations at which of our three types of school we should continue at the age of 10 or 11 years. That I believe was the intention but one of the teachers was a blabbermouth. (So it was not to avoid discrimination of the dumb but rather to avoid that pupils feel more intelligent than their teachers.) That avoidance was untenable, in the 1st semester of my freshman year I got thrown out of the senior physics class for correcting an erroneous statement by the teacher that was patently at odds with Newton's 3rd law of motion. For every action, there is an equal but opposite reaction. Pretty basic stuff. But correcting the teacher in front of the other students was absolutely not to be tolerated. But I felt correcting him AND setting it straight was more important to the rest of the nominally 20 students present than any embarrassment it may have caused him. Same with the papered EE's who can't understand that E=MV2 does not have a speed floor, below which its doesn't work when the electron beam in a klystron amplifier is only moving at a potential of 20,000 volts. 
The problem not understood is that the amplification is obtained not from a current variation, but a velocity variation induced by a 1 watt signal speeding up or slowing down the passing beam as it traverses the first cavity of 4, the next two to control the bandwidth, the last one picks 30 kilowatts back off the beam by the capacitative coupling effects as the beam goes on thru into a copper funnel cooled by 70 gallons of very pure water to absorb the end of that beam which takes around 125 kilowatts to generate. But that beams electrons have mass, another name for weight, and one watt to slow them slows them more than 1 watt to speed them up speeds them up, so at high power levels, the tube is effective longer in terms of the transit time. This puts a time of flight error into the signal we didn't know how to pre-distort for in the 1970's. A very dependable way to generate transmitter power levels that was also not very efficient, 95% of the uhf stations that went dark in those years were bankrupted by the power bills even at 3 cents a kw. So there was a huge financial push to find a better method as that time distortion would have killed hidef tv before it ever got out of the laboritory, And E=MV2 is as valid at 25 mph as it is at C speed, nominally 186,272 miles per second. Yup, I understand Albert Eintein's theory. Di
Re: smartctl cannot access my storage, need syntax help
Hi,

i see that i messed up "h" and "k" in my explanation of the fight over the link targets in /dev/disk/by-id.

So another attempt:

sdh has a unique serial number GSTD02TB230102. Thus we see in
https://lists.debian.org/debian-user/2024/01/msg00667.html
these two links:

/dev/sdh    /dev/disk/by-id/ata-Gigastone_SSD_GSTD02TB230102
/dev/sdh1   /dev/disk/by-id/ata-Gigastone_SSD_GSTD02TB230102-part1

sdi and sdj share the serial number GST02TBG221146. So the concurrent attempts to create the links let only these two survive:

/dev/sdi    /dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146
/dev/sdj1   /dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146-part1

sdk and sdl share GSTG02TB230206. The survivors are:

/dev/sdk    /dev/disk/by-id/ata-Gigastone_SSD_GSTG02TB230206
/dev/sdk1   /dev/disk/by-id/ata-Gigastone_SSD_GSTG02TB230206-part1

The next system startup might yield other survivors.

Have a nice day :)

Thomas
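Since the by-id links hide one disk of each colliding pair, it can help to ask the block layer directly which devices report the same serial number. The following is a sketch only: dup_serials is a made-up helper name, and it assumes a util-linux lsblk(8) that offers the SERIAL column.

```shell
# dup_serials: read "NAME SERIAL" lines on stdin and print every serial
# number that is reported by more than one device, with the device names.
# Lines without a serial (fewer than two fields) are ignored.
dup_serials() {
    awk 'NF >= 2 { devs[$2] = devs[$2] " " $1; n[$2]++ }
         END { for (s in n) if (n[s] > 1) print s ":" devs[s] }' | sort
}

# Example use: -d lists whole disks only, -n drops the header line.
#   lsblk -dn -o NAME,SERIAL | dup_serials
```

Fed with the five disks described above, it would report sdi and sdj under GST02TBG221146 and sdk and sdl under GSTG02TB230206, while sdh would not be listed.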
Re: smartctl cannot access my storage, need syntax help
Hi, David Christensen wrote: > I suspect the conflicting serial numbers are causing problems in the kernel, > as indicated by the /dev/disk/by-id/* problems. That's not in the kernel but in udev/systemd's process of creating the symbolic links in /dev/disk/by-id/. It gets /dev/sd[h-l] and /dev/sd[h-l]1 as kernel generated device files. But sd[ij] and also sd[hl] show pair-wise the same serial numbers. In case of sd[ij] the outcome is mixed: links to sdi and sdj1 survive. In case of sd[hl] we see a less strange outcome: sdh and sdh1, while sdl and sdl1 are missing. The open question (at least to me) is whether it's the disks or the controllers or the drivers which cause the duplication. Have a nice day :) Thomas
Re: smartctl cannot access my storage, need syntax help
Hi,

Curt wrote:
> I discovered a couple of discussions of the phenomenon, the upshot of which
> were:
> 1) That's what you get when you purchase cheap SSDs.
> https://www.reddit.com/r/truenas/comments/s0rrpo/two_sata_ssds_with_identical_serial_numbers/
> 2) SSDs belonging to the same software RAID show identical serial numbers
> in software, but these numbers don't match the serial numbers printed on the
> SSDs themselves.
> https://www.reddit.com/r/truenas/comments/s0rrpo/two_sata_ssds_with_identical_serial_numbers/

Those URLs are identical. (OMG! Is it contagious?)

Number 2 would match my suspicion that some layer in the disk driving gets confused and mixes up the serial numbers.

> But you said *similar*.

By "colliding serial numbers" i do mean "identical serial numbers". However cheap the disks may be, that would be no excuse for not making them individually distinguishable.

> As Gene's threads have too many movable parts
> for me to follow, on that point I couldn't say.

This one begins to gain presence in the web. So one can use search engines and AI to untangle its sub-threads. I meanwhile participate in two of them: serial number collision, and the rsync-caused OOM killer (solved now, but how?).

Have a nice day :)

Thomas
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 06:18, gene heskett wrote: On 1/17/24 00:52, David Christensen wrote: I suggest removing one GST02TBG221146 and one GSTG02TB230206. Put them on the shelf, in other computer(s), or sell them. Then perhaps copying the /home RAID10 2 TB to one Gigastone 2 TB SSD would work. Or LABEL them. I suspect the conflicting serial numbers are causing problems in the kernel, as indicated by the /dev/disk/by-id/* problems. I would remove one each of the duplicate serial number disks to eliminate that possibility. David Cheers, Gene Heskett.
Re: smartctl cannot access my storage, need syntax help
On 1/16/24 23:46, Thomas Schmitt wrote: Gene Heskett wrote: One of these mails from a thread in december reveals that the three unique serial numbers GSTD02TB230102, GST02TBG221146, GSTG02TB230206 each come with a different version of "1C0", "7A0", "5A0", respectively. https://www.mail-archive.com/debian-user@lists.debian.org/msg799307.html That's unexpected, too, as the disk properties look identical elsewise. Thank you for locating the lshw(1) output. It appears to have been run when one Gigastone SSD was on the motherboard SATA controller and four Gigastone SSD's were on the 6-port HBA:
2024-01-17 08:58:54 dpchrist@laalaa ~
$ egrep 'sata|disk|product|version|serial' gene-heskett-coyote-lshw.out | grep -B 1 -A 2 Gigastone
*-disk:1
product: Gigastone SSD
version: 7A0
serial: GST02TBG221146
--
*-disk:0
product: Gigastone SSD
version: 7A0
serial: GST02TBG221146
--
*-disk:1
product: Gigastone SSD
version: 5A0
serial: GSTG02TB230206
--
*-disk:2
product: Gigastone SSD
version: 5A0
serial: GSTG02TB230206
--
*-disk:3
product: Gigastone SSD
version: 1C0
serial: GSTD02TB230102
David
Re: smartctl cannot access my storage, need syntax help
On 2024-01-17, Thomas Schmitt wrote: > > This is just weird. > I still have difficulties to believe that any disk manufacturer would > hand out disks with colliding serial numbers. I googled for this > phenomenon, but except two mails of Gene nothing similar popped up. I discovered a couple of discussions of the phenomenon, the upshot of which were: 1) That's what you get when you purchase cheap SSDs. https://www.reddit.com/r/truenas/comments/s0rrpo/two_sata_ssds_with_identical_serial_numbers/ 2) SSDs belonging to the same software RAID show identical serial numbers in software, but these numbers don't match the serial numbers printed on the SSDs themselves. https://www.reddit.com/r/truenas/comments/s0rrpo/two_sata_ssds_with_identical_serial_numbers/ But you said *similar*. As Gene's threads have too many movable parts for me to follow, on that point I couldn't say.
Re: smartctl cannot access my storage, need syntax help
Hi, after i began enumerating suspects, gene heskett wrote: > terminals scroll back memory, I purposely set this > particular terminals scrollback to 200 lines with that in mind. How large was it set when your runs caused the OOM killer to act ? I have a good number of xterms with 10,000 lines each. No tabs, no KDE, but 8 fvwm "desktops" (virtual screens) full of terminal windows. > > [Request to test the disks one-by-one on some other computer, whether > > they bear the same serial number at all controllers in all machines.] > Not as easily tried, the other 4 are in twin mounts in another portion of > the drive cages in this 30" tall tiger direct cage and not too readily > accessible w/o tipping the mobo out on its hinged mount. One should raise protest at Gigastone if the disks really have the same serial numbers. But before doing so, one would have to make sure that it is not some weird effect of them all being plugged into that machine at the same time. > One of you made the remark that seems to be the secret password. What did finally help ? Just the shorter terminal scroll back memory ? It would explain why a verbose rsync could summon the OOM killer always around the same stage of progress. But what waste of memory would have to happen with each of the rsync messages ? (You mentioned LABEL as a possibility. But not as actually used.) > Its still, slowly at 10 megs a second, working. I see in your previous mail rsync option --bwlimit=10m . But in the same mail there is an older quote from you that --bwlimit=3m only prolonged the time until the OOM killer appeared. So i wonder whether it would work at a more contemporary speed. Self-incrimination: The rest of this mail is off topic. > they gave all 7nth graders the Iowa > test in 1947, similar to the S/B IQ test but not copyrighted, there fore a > lot cheaper, and I came out of that with an equivalent of 147. I was tested in the 1960s but they did not tell the results to kids or parents. 
We only got recommendations at which of our three types of school we should continue at the age of 10 or 11 years. (So it was not to avoid discrimination of the dumb but rather to avoid that pupils feel more intelligent than their teachers.) Have a nice day :) Thomas
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 02:42, Thomas Schmitt wrote: Hi, Gene Heskett wrote: lsblk, which I've published several times, shows 5 drives. Duh. Obviously this thread overstretches my mental capacity. And I've since tried cp in addition to rsync, does the same thing, killing the sysytem with the OOM but much quicker. cp using all system memory (32Gb) in 1 minute, another 500K into swap adds another 15 secs, and the OOM kills the system. So both cp and rsync act broken. I get the suspicion that your disk set overstretches the mental capacity of the hardware or the operating system. Both "cp" and "rsync" are heavily tested by the GNU/Linux community and quite independently developed. A common memory leak would have to sit deeper in the software stack, i.e. in kernel or firmware. kernel. firmware, or terminals scroll back memory, I purposely set this particular terminals scrollback to 200 lines with that in mind. rsync, with a --bwlimit=3m set, takes much longer to kill the system but the amount of data moved is very similar, 13.5G from clean disk to system freeze for rsync, 13.4G for cp. This observation might be significant. But i fail to make up a theory. One of the things I'm fairly good at, they gave all 7nth graders the Iowa test in 1947, similar to the S/B IQ test but not copyrighted, there fore a lot cheaper, and I came out of that with an equivalent of 147. I quit school 2 years later when I could and went to work fixing tv's. 
Had my draft number moved up in '52 in the middle of Korea to get that out of the way, drafted was 2 years, volunteered was 4 years, but failed the AFQT by getting a 98 out of 100, which earned me a 4F classification because I wouldn't take orders from the sergeant, I find out the next best score that day among 130+ boys was 36/100 which freed me to let a girl become my wife in '57, & started making kids, got a 1st phone in 1962 without cracking a book, did the same thing in 1972 to become a registered CET which I'll readily admit is getting rusty in my dotage at 89 yo. The technology is slowly passing me by since I retired in the middle of 2002. Because I went diabetic in the '80's, my beer limit is 1, but I'd do it with any of you folks if we ever meet in person. Let the war stories flow. ;o)> <-smiley with a goatee. That copy is now up to 4x the data copied in any other try.
root@coyote:~# df && free
Filesystem  1K-blocks      Used  Available Use% Mounted on
udev         16327704         0   16327704   0% /dev
tmpfs         3272684      1904    3270780   1% /run
/dev/sda1   863983352  22346308  797675396   3% /
tmpfs        16363420      1244   16362176   1% /dev/shm
tmpfs            5120         8       5112   1% /run/lock
/dev/sda3    47749868       612   45291248   1% /tmp
/dev/md0p1 1796382580 335101664 1369955940  20% /home
tmpfs         3272684      3752    3268932   1% /run/user/1000
/dev/sdh1  1967892164  64369552 1803486364   4% /mnt/homevol
               total        used        free      shared  buff/cache   available
Mem:        32726840     3453372      199708      919044    30336824    29273468
Swap:      111902712        1536   111901176
And swap use has not increased, it's stabilized.
gene@coyote:~/src/klipper-docs$ lsblk -d -o NAME,MAJ:MIN,MODEL,SERIAL,WWN /dev/sd[hijkl]
NAME MAJ:MIN MODEL         SERIAL         WWN
sdh    8:112 Gigastone SSD GSTD02TB230102
sdi    8:128 Gigastone SSD GST02TBG221146
sdj    8:144 Gigastone SSD GST02TBG221146
sdk    8:160 Gigastone SSD GSTG02TB230206
sdl    8:176 Gigastone SSD GSTG02TB230206
This is just weird. I still have difficulties to believe that any disk manufacturer would hand out disks with colliding serial numbers. 
I googled for this phenomenon, but except two mails of Gene nothing similar popped up. One of these mails from a thread in december reveals that the three unique serial numbers GSTD02TB230102, GST02TBG221146, GSTG02TB230206 each come with a different version of "1C0", "7A0", "5A0", respectively. Which is why, when I let my imagination out to play w/o a chaperone, my thoughts run toward some invented date code for a batch number. https://www.mail-archive.com/debian-user@lists.debian.org/msg799307.html That's unexpected, too, as the disk properties look identical elsewise. I guess that it is not possible to identify which disk came with which of the two separate purchases ? Once removed from the boxes, no. How many days were these purchases apart ? 6 weeks or so, as I formulated what to do next. But that isn't carved even in sandstone. David Christensen wrote: I suggest removing one GST02TBG221146 and one GSTG02TB230206. Put them on the shelf, in other computer(s), or sell them. Then perhaps copying the /home RAID10 2 TB to one Gigastone 2 TB SSD would work. I join this proposal. ... and dimly remember to have seen the proposal to attach the disks one by one without the other four, in order to see whether the
Re: smartctl cannot access my storage, need syntax help
On 1/17/24 00:52, David Christensen wrote: On 1/16/24 17:08, gene heskett wrote: > lsblk, which I've published several times, shows 5 drives. by-id listing > only shows 3. The drive I've been trying to use bounces from /dev/sdd to > sde to sdh dependin on which controller it is curently plugged into. > > And I've since tried cp in addition to rsync, does the same thing, > killing the sysytem with the OOM but much quicker. cp using all system > memory (32Gb) in 1 minute, another 500K into swap adds another 15 secs, > and the OOM kills the system. So both cp and rsync act broken. > > rsync, with a --bwlimit=3m set, takes much longer to kill the system but > the amount of data moved is very similar, 13.5G from clean disk to > system freeze for rsync, 13.4G for cp. On 1/16/24 18:10, gene heskett wrote: On 1/16/24 11:08, Thomas Schmitt wrote: ls -l /dev/sd[ij]* oot@coyote:~# ls -l /dev/sd[ij]* brw-rw 1 root disk 8, 128 Jan 16 05:01 /dev/sdi brw-rw 1 root disk 8, 129 Jan 16 05:01 /dev/sdi1 brw-rw 1 root disk 8, 144 Jan 16 05:01 /dev/sdj brw-rw 1 root disk 8, 145 Jan 16 05:01 /dev/sdj1 root@coyote:~# lsblk -d -o NAME,MAJ:MIN,MODEL,SERIAL,WWN /dev/sd[hijkl] gene@coyote:~/src/klipper-docs$ lsblk -d -o NAME,MAJ:MIN,MODEL,SERIAL,WWN /dev/sd[hijkl] NAME MAJ:MIN MODEL SERIAL WWN sdh 8:112 Gigastone SSD GSTD02TB230102 sdi 8:128 Gigastone SSD GST02TBG221146 sdj 8:144 Gigastone SSD GST02TBG221146 sdk 8:160 Gigastone SSD GSTG02TB230206 sdl 8:176 Gigastone SSD GSTG02TB230206 I suggest removing one GST02TBG221146 and one GSTG02TB230206. Put them on the shelf, in other computer(s), or sell them. Then perhaps copying the /home RAID10 2 TB to one Gigastone 2 TB SSD would work. David . Or LABEL them. And I seem to be making some progress this morning. opening a konsole and setting scrollback to 200 lines, limiting its use of memory, the tan memory bar in htop if full scale and it a couple megs into swap out of 107G. and the system still feels normal. 
in another multitabbed xfce4 shell, a "df && free" is showing this:
root@coyote:~# df && free
Filesystem  1K-blocks      Used  Available Use% Mounted on
udev         16327704         0   16327704   0% /dev
tmpfs         3272684      1904    3270780   1% /run
/dev/sda1   863983352  22346276  797675428   3% /
tmpfs        16363420      1244   16362176   1% /dev/shm
tmpfs            5120         8       5112   1% /run/lock
/dev/sda3    47749868       580   45291280   1% /tmp
/dev/md0p1 1796382580 335100148 1369957456  20% /home
tmpfs         3272684      3752    3268932   1% /run/user/1000
/dev/sdh1  1967892164  23830812 1844025104   2% /mnt/homevol
               total        used        free      shared  buff/cache   available
Mem:        32726840     3343048      218316      922196    30443960    29383792
Swap:      111902712        1536   111901176
root@coyote:~#
rsync has been stopped and restarted, 4 times, but stopping it has not recovered the cache, so swap is increasing slowly. That faint knocking sound? Me, knocking on wood... ;o)> command line: rsync -a --bwlimit=10m --fsync --progress /home/ /mnt/homevol So we'll eventually either git-r-done or crash the system but this is farther than it ever got before in several days. Thanks everybody. Cheers, Gene Heskett.
Re: smartctl cannot access my storage, need syntax help
Hi, Gene Heskett wrote: > lsblk, which I've published several times, shows 5 drives. Duh. Obviously this thread overstretches my mental capacity. > And I've since tried cp in addition to rsync, does the same thing, killing > the sysytem with the OOM but much quicker. cp using all system memory (32Gb) > in 1 minute, another 500K into swap adds another 15 secs, and the OOM kills > the system. So both cp and rsync act broken. I get the suspicion that your disk set overstretches the mental capacity of the hardware or the operating system. Both "cp" and "rsync" are heavily tested by the GNU/Linux community and quite independently developed. A common memory leak would have to sit deeper in the software stack, i.e. in kernel or firmware. > rsync, with a --bwlimit=3m set, takes much longer to kill the system but the > amount of data moved is very similar, 13.5G from clean disk to system freeze > for rsync, 13.4G for cp. This observation might be significant. But i fail to make up a theory. > gene@coyote:~/src/klipper-docs$ lsblk -d -o NAME,MAJ:MIN,MODEL,SERIAL,WWN > /dev/sd[hijkl] > NAME MAJ:MIN MODEL SERIAL WWN > sdh8:112 Gigastone SSD GSTD02TB230102 > sdi8:128 Gigastone SSD GST02TBG221146 > sdj8:144 Gigastone SSD GST02TBG221146 > sdk8:160 Gigastone SSD GSTG02TB230206 > sdl8:176 Gigastone SSD GSTG02TB230206 This is just weird. I still have difficulties to believe that any disk manufacturer would hand out disks with colliding serial numbers. I googled for this phenomenon, but except two mails of Gene nothing similar popped up. One of these mails from a thread in december reveals that the three unique serial numbers GSTD02TB230102, GST02TBG221146, GSTG02TB230206 each come with a different version of "1C0", "7A0", "5A0", respectively. https://www.mail-archive.com/debian-user@lists.debian.org/msg799307.html That's unexpected, too, as the disk properties look identical elsewise. 
I guess that it is not possible to identify which disk came with which of the two separate purchases ? How many days were these purchases apart ? David Christensen wrote: > I suggest removing one GST02TBG221146 and one GSTG02TB230206. Put them on > the shelf, in other computer(s), or sell them. Then perhaps copying the > /home RAID10 2 TB to one Gigastone 2 TB SSD would work. I join this proposal. ... and dimly remember to have seen the proposal to attach the disks one by one without the other four, in order to see whether the serial numbers are the same as with all five together. Since you got quite some hardware zoo: Consider to try the Gigastone disks with a different machine. Do the serial numbers show up as with the machine where you experience all those difficulties. Have a nice day :) Thomas
Re: smartctl cannot access my storage, need syntax help
On 1/16/24 17:08, gene heskett wrote: > lsblk, which I've published several times, shows 5 drives. by-id listing > only shows 3. The drive I've been trying to use bounces from /dev/sdd to > sde to sdh dependin on which controller it is curently plugged into. > > And I've since tried cp in addition to rsync, does the same thing, > killing the sysytem with the OOM but much quicker. cp using all system > memory (32Gb) in 1 minute, another 500K into swap adds another 15 secs, > and the OOM kills the system. So both cp and rsync act broken. > > rsync, with a --bwlimit=3m set, takes much longer to kill the system but > the amount of data moved is very similar, 13.5G from clean disk to > system freeze for rsync, 13.4G for cp. On 1/16/24 18:10, gene heskett wrote: On 1/16/24 11:08, Thomas Schmitt wrote: ls -l /dev/sd[ij]* oot@coyote:~# ls -l /dev/sd[ij]* brw-rw 1 root disk 8, 128 Jan 16 05:01 /dev/sdi brw-rw 1 root disk 8, 129 Jan 16 05:01 /dev/sdi1 brw-rw 1 root disk 8, 144 Jan 16 05:01 /dev/sdj brw-rw 1 root disk 8, 145 Jan 16 05:01 /dev/sdj1 root@coyote:~# lsblk -d -o NAME,MAJ:MIN,MODEL,SERIAL,WWN /dev/sd[hijkl] gene@coyote:~/src/klipper-docs$ lsblk -d -o NAME,MAJ:MIN,MODEL,SERIAL,WWN /dev/sd[hijkl] NAME MAJ:MIN MODEL SERIAL WWN sdh 8:112 Gigastone SSD GSTD02TB230102 sdi 8:128 Gigastone SSD GST02TBG221146 sdj 8:144 Gigastone SSD GST02TBG221146 sdk 8:160 Gigastone SSD GSTG02TB230206 sdl 8:176 Gigastone SSD GSTG02TB230206 I suggest removing one GST02TBG221146 and one GSTG02TB230206. Put them on the shelf, in other computer(s), or sell them. Then perhaps copying the /home RAID10 2 TB to one Gigastone 2 TB SSD would work. David
Re: smartctl cannot access my storage, need syntax help
gene heskett composed on 2024-01-16 20:08 (UTC-0500):
> Felix Miata wrote:
>> I straightened out the wrapping mess, and gave each entry a line number. I see nothing I recognize as representing serial number duplication among /dev/sdX (physical device) names:
>> /dev/sda  9 /dev/disk/by-id/ata-Samsung_SSD_870_QVO_1TB_S5RRNF0T201730V
>> /dev/sdd 19 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V
>> /dev/sde 28 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E
>> /dev/sdf 36 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T
>> /dev/sdg 43 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W
>> /dev/sdh 51 /dev/disk/by-id/ata-Gigastone_SSD_GSTD02TB230102
>> /dev/sdi 53 /dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146
>> /dev/sdk 55 /dev/disk/by-id/ata-Gigastone_SSD_GSTG02TB230206
>> Exactly which line numbers represent duplication among the physical drives?
> lsblk, which I've published several times, shows 5 drives. by-id listing only shows 3. The drive I've been trying to use bounces from /dev/sdd to sde to sdh depending on which controller it is currently plugged into.
From your 2024-01-15 17:56 -0500 post, I see 8 unique serial numbers from SATA SSDs, 5 Samsung, 3 Gigastone. I ignore all your posts with lsblk that didn't use the -f option to facilitate identifying individual SSDs.
> And I've since tried cp in addition to rsync, does the same thing, killing the system with the OOM but much quicker. cp using all system memory (32Gb) in 1 minute, another 500K into swap adds another 15 secs, and the OOM kills the system. So both cp and rsync act broken.
> rsync, with a --bwlimit=3m set, takes much longer to kill the system but the amount of data moved is very similar, 13.5G from clean disk to system freeze for rsync, 13.4G for cp.
-- Evolution as taught in public schools is, like religion, based on faith, not based on science. Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata
Re: smartctl cannot access my storage, need syntax help
On 1/16/24 11:08, Thomas Schmitt wrote: ls -l /dev/sd[ij]*
root@coyote:~# ls -l /dev/sd[ij]*
brw-rw 1 root disk 8, 128 Jan 16 05:01 /dev/sdi
brw-rw 1 root disk 8, 129 Jan 16 05:01 /dev/sdi1
brw-rw 1 root disk 8, 144 Jan 16 05:01 /dev/sdj
brw-rw 1 root disk 8, 145 Jan 16 05:01 /dev/sdj1
root@coyote:~# lsblk -d -o NAME,MAJ:MIN,MODEL,SERIAL,WWN /dev/sd[hijkl]
gene@coyote:~/src/klipper-docs$ lsblk -d -o NAME,MAJ:MIN,MODEL,SERIAL,WWN /dev/sd[hijkl]
NAME MAJ:MIN MODEL         SERIAL         WWN
sdh    8:112 Gigastone SSD GSTD02TB230102
sdi    8:128 Gigastone SSD GST02TBG221146
sdj    8:144 Gigastone SSD GST02TBG221146
sdk    8:160 Gigastone SSD GSTG02TB230206
sdl    8:176 Gigastone SSD GSTG02TB230206
note added l to get them all
gene@coyote:~/src/klipper-docs$
Cheers, Gene Heskett.
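For anyone wanting to spot such collisions without reading the table by eye, here is a rough sketch. The function name dup_serials is invented for this example; it reads "NAME SERIAL" pairs on stdin, the shape that `lsblk -d -n -o NAME,SERIAL` prints, and reports every serial that more than one disk claims. The sample input is copied from the lsblk output above.

```shell
# Print every serial number that is reported by more than one disk.
# Input on stdin: lines of "NAME SERIAL".
# (hypothetical helper; dup_serials is not an existing tool)
dup_serials() {
    awk 'NF >= 2 { names[$2] = names[$2] " " $1; count[$2]++ }
         END { for (s in count) if (count[s] > 1) print s ":" names[s] }' |
        sort
}

# Sample taken from the lsblk output quoted in this thread.
dup_serials <<'EOF'
sdh GSTD02TB230102
sdi GST02TBG221146
sdj GST02TBG221146
sdk GSTG02TB230206
sdl GSTG02TB230206
EOF
```

With the sample above it prints `GST02TBG221146: sdi sdj` and `GSTG02TB230206: sdk sdl`. On a live system the input would come from `lsblk -d -n -o NAME,SERIAL | dup_serials`; disks that report no serial at all are skipped by the `NF >= 2` check.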
Re: smartctl cannot access my storage, need syntax help
On 1/16/24 06:09, Felix Miata wrote: Tom Furie composed on 2024-01-16 08:18 (UTC): Felix Miata writes: /dev/sdc 18 /dev/disk/by-id/usb-Brother_MFC-J6920DW_BROG5F229909-0:0 # How does a printer get a storage device assignment??? By having some kind of SD card slot or similar. So this pollution only results from a USB-connected printer? IP printer connections don't cause it too? Since I have one of the above printers it does indeed have an editable ipv4 address, but I don't generally use it as the usb2 is faster. It's been so long since I did use that interface that I do not recall if it listed the card memory. I'd expect it would since it can also do a free-standing copy from its tabloid sized scanner. The printer can handle tabloid sized paper by hand feeding, so the copy function includes tabloid size too. Cheers, Gene Heskett.
Re: smartctl cannot access my storage, need syntax help
On Tue 16 Jan 2024 at 20:08:12 (-0500), gene heskett wrote: > On 1/16/24 00:56, Felix Miata wrote: > > gene heskett composed on 2024-01-15 17:56 (UTC-0500): > > > > > Thanks for that composition: but it will be word wrapped: > > > root@coyote:~# for j in /dev/disk/by-id/* ; do printf '%s\t%s\n' > > > "$(realpath "$j")" "$j" ; done [ … ] > > I straightened out the wrapping mess, and gave each entry a line number. I > > see > > nothing I recognize as representing serial number duplication among /dev/sdX > > (physical device) names: > > [ … ] > > /dev/sdd19 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V > > /dev/sdd20 /dev/disk/by-id/wwn-0x5002538f413394ae > > /dev/sdd1 21 > > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V-part1 > > /dev/sdd1 22 /dev/disk/by-id/wwn-0x5002538f413394ae-part1 > > /dev/sdd2 23 > > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V-part2 > > /dev/sdd2 24 /dev/disk/by-id/wwn-0x5002538f413394ae-part2 > > /dev/sdd3 25 > > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V-part3 > > /dev/sdd3 26 /dev/disk/by-id/wwn-0x5002538f413394ae-part3 > > /dev/sde27 /dev/disk/by-id/wwn-0x5002538f413394a9 > > /dev/sde28 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E > > /dev/sde1 29 > > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E-part1 > > /dev/sde1 30 /dev/disk/by-id/wwn-0x5002538f413394a9-part1 > > /dev/sde2 31 > > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E-part2 > > /dev/sde2 32 /dev/disk/by-id/wwn-0x5002538f413394a9-part2 > > /dev/sde3 33 > > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E-part3 > > /dev/sde3 34 /dev/disk/by-id/wwn-0x5002538f413394a9-part3 > > /dev/sdf35 /dev/disk/by-id/wwn-0x5002538f413394a5 > > /dev/sdf36 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T > > /dev/sdf1 37 > > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T-part1 > > /dev/sdf1 38 /dev/disk/by-id/wwn-0x5002538f413394a5-part1 > > /dev/sdf2 39 > > 
/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T-part2 > > /dev/sdf2 40 /dev/disk/by-id/wwn-0x5002538f413394a5-part2 > > /dev/sdf3 41 > > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T-part3 > > /dev/sdf3 42 /dev/disk/by-id/wwn-0x5002538f413394a5-part3 > > /dev/sdg43 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W > > /dev/sdg44 /dev/disk/by-id/wwn-0x5002538f413394b0 > > /dev/sdg1 45 > > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W-part1 > > /dev/sdg1 46 /dev/disk/by-id/wwn-0x5002538f413394b0-part1 > > /dev/sdg2 47 > > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W-part2 > > /dev/sdg2 48 /dev/disk/by-id/wwn-0x5002538f413394b0-part2 > > /dev/sdg3 49 > > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W-part3 > > /dev/sdg3 50 /dev/disk/by-id/wwn-0x5002538f413394b0-part3 > lsblk, which I've published several times, shows 5 drives. by-id > listing only shows 3. The drive I've been trying to use bounces from > /dev/sdd to sde to sdh dependin on which controller it is curently > plugged into. I take it that you're trying to copy to one Gigastone SSD. Presumably the kernel favours some controllers over others in the race to name them. This is why using the kernel's device names is no longer recommended. > And I've since tried cp in addition to rsync, does the same thing, > killing the sysytem with the OOM but much quicker. cp using all system > memory (32Gb) in 1 minute, another 500K into swap adds another 15 > secs, and the OOM kills the system. So both cp and rsync act broken. I'd be tempted to bisect the problem by copying to another machine though a cat5 cable. > rsync, with a --bwlimit=3m set, takes much longer to kill the system > but the amount of data moved is very similar, 13.5G from clean disk to > system freeze for rsync, 13.4G for cp. I don't know enough about how rsync behaves to interpret that coincidence, but it seems ominous on its face. Cheers, David.
Re: smartctl cannot access my storage, need syntax help
On 1/16/24 01:18, Felix Miata wrote: Felix Miata composed on 2024-01-16 01:05 (UTC-0500): gene heskett composed on 2024-01-15 18:37 (UTC-0500): Ah, but I finally glommed onto the bug tan memory bar in htop as it was running, someplace in the data chain is a huge memory leak, my crash is caused by the OOM daemon killing things. And it only occurs when I run rsync. Only takes it 10 minutes to eat 32G of memory, then 500k into swap, and the OOM daemon starts killing the system until there's nothing left to run. What does free report before starting rsync? Do you have all your swap on a partition? Do you have any swapspace? I would log out of XFCE, login on a vtty to open top, then login on another to try to run rsync. If that fails OOM too, since the target is ostensibly starting from scratch, use MC, and divide the job into the source's directories if necessary. MC gets rather bogged down if you try to do a bazillion individual files in a single copy operation. Trying to think outside the box, something else to think about, from the man page: [quote] --archive, -a This is equivalent to -rlptgoD. It is a quick way of saying you want recursion and want to preserve almost everything. Be aware that it does not include preserving ACLs (-A), xattrs (-X), atimes (-U), crtimes (-N), nor the finding and preserving of hardlinks (-H). [/quote] If rsync really is bugged, maybe a change of options would avoid the bug. Try instead of -av, -rlptgoDAXUNH. Could it be that verbosity is the OOM crippler, and not necessarily from rsync itself, but possibly from the xterm in which rsync is running? Does your source contain any hard links? Do you use ACLs or xattrs? Unreported here because it didn't seem to have any effect: I've tried to test that theory by clearing the back-trace buffer at 30 second intervals. No obviously detectable effect. Untested is setting that back to a 1000 line default. 
And since I've driven around 170 miles in poor visibility bad weather today, no more tests will be done tonight, I'm not the 16-year-old I was when I learned to drive 70 mph on even worse roads 75 years ago. So I'll sign off shortly. Cheers, Gene Heskett.
Re: smartctl cannot access my storage, need syntax help
On 1/16/24 01:05, Felix Miata wrote: gene heskett composed on 2024-01-15 18:37 (UTC-0500): Ah, but I finally glommed onto the bug tan memory bar in htop as it was running, someplace in the data chain is a huge memory leak, my crash is caused by the OOM daemon killing things. And it only occurs when I run rsync. Only takes it 10 minutes to eat 32G of memory, then 500k into swap, and the OOM daemon starts killing the system until there's nothing left to run. What does free report before starting rsync? Do you have all your swap on a partition? Do you have any swapspace? Actually, swap is in 2 locations, one is a swap-dir on /dev/sda, 47G IIRC, and 60G on md1. Shows in htop as 107G total. I would log out of XFCE, login on a vtty to open top, then login on another to try to run rsync. If that fails OOM too, since the target is ostensibly starting from scratch, use MC, and divide the job into the source's directories if necessary. MC gets rather bogged down if you try to do a bazillion individual files in a single copy operation. True, but I don't recall it ever failing. Cheers, Gene Heskett.
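To answer the "what does free report" question over time rather than once, something like the following could be left running in a second terminal while the copy runs. This is only a sketch: mem_snapshot is a made-up name, and the choice of fields (MemAvailable, Cached, Dirty, Writeback, SwapFree) is an assumption about what is worth watching here. It reads /proc/meminfo by default.

```shell
# Print selected fields (in kB) from a /proc/meminfo-style file on one line.
# Usage: mem_snapshot [FILE]   (FILE defaults to /proc/meminfo)
# (hypothetical helper; the field selection is an assumption)
mem_snapshot() {
    awk '$1 ~ /^(MemAvailable|Cached|Dirty|Writeback|SwapFree):$/ {
             sub(":", "", $1)          # drop the trailing colon from the name
             printf "%s=%s ", $1, $2   # e.g. "Dirty=1234 "
         }
         END { print "" }' "${1:-/proc/meminfo}"
}

# To log every 10 seconds while a copy runs (stop with Ctrl-C):
#   while sleep 10; do date +%T; mem_snapshot; done
```

If Cached and Dirty climb steadily while MemAvailable stays high, the memory is most likely page cache rather than a leak in rsync or cp, but that interpretation would still need checking against the OOM killer's own log lines.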
Re: smartctl cannot access my storage, need syntax help
On 1/16/24 00:56, Felix Miata wrote: gene heskett composed on 2024-01-15 17:56 (UTC-0500): Thanks for that composition: but it will be word wrapped: root@coyote:~# for j in /dev/disk/by-id/* ; do printf '%s\t%s\n' "$(realpath "$j")" "$j" ; done /dev/sr0/dev/disk/by-id/ata-ATAPI_iHAS424_B_3524253_327133504865 /dev/sdi/dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146 /dev/sdj1 /dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146-part1 /dev/sdh/dev/disk/by-id/ata-Gigastone_SSD_GSTD02TB230102 /dev/sdh1 /dev/disk/by-id/ata-Gigastone_SSD_GSTD02TB230102-part1 /dev/sdk/dev/disk/by-id/ata-Gigastone_SSD_GSTG02TB230206 /dev/sdk1 /dev/disk/by-id/ata-Gigastone_SSD_GSTG02TB230206-part1 /dev/sdf/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T /dev/sdf1 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T-part1 /dev/sdf2 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T-part2 /dev/sdf3 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T-part3 /dev/sde/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E /dev/sde1 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E-part1 /dev/sde2 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E-part2 /dev/sde3 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E-part3 /dev/sdd/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V /dev/sdd1 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V-part1 /dev/sdd2 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V-part2 /dev/sdd3 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V-part3 /dev/sdg/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W /dev/sdg1 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W-part1 /dev/sdg2 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W-part2 /dev/sdg3 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W-part3 /dev/sda/dev/disk/by-id/ata-Samsung_SSD_870_QVO_1TB_S5RRNF0T201730V /dev/sda1 
/dev/disk/by-id/ata-Samsung_SSD_870_QVO_1TB_S5RRNF0T201730V-part1 /dev/sda2 /dev/disk/by-id/ata-Samsung_SSD_870_QVO_1TB_S5RRNF0T201730V-part2 /dev/sda3 /dev/disk/by-id/ata-Samsung_SSD_870_QVO_1TB_S5RRNF0T201730V-part3 /dev/md0/dev/disk/by-id/md-name-coyote:0 /dev/md0p1 /dev/disk/by-id/md-name-coyote:0-part1 /dev/md2/dev/disk/by-id/md-name-coyote:2 /dev/md1/dev/disk/by-id/md-name-_none_:1 /dev/md0/dev/disk/by-id/md-uuid-3d5a3621:c0e32c8a:e3f7ebb3:318edbfb /dev/md0p1 /dev/disk/by-id/md-uuid-3d5a3621:c0e32c8a:e3f7ebb3:318edbfb-part1 /dev/md1/dev/disk/by-id/md-uuid-57a88605:27f5a773:5be347c1:7c5e7342 /dev/md2/dev/disk/by-id/md-uuid-bb6e03ce:19d290c8:5171004f:0127a392 /dev/sdc/dev/disk/by-id/usb-Brother_MFC-J6920DW_BROG5F229909-0:0 /dev/sdb/dev/disk/by-id/usb-USB_Mass_Storage_Device_816820130806-0:0 /dev/sdf/dev/disk/by-id/wwn-0x5002538f413394a5 /dev/sdf1 /dev/disk/by-id/wwn-0x5002538f413394a5-part1 /dev/sdf2 /dev/disk/by-id/wwn-0x5002538f413394a5-part2 /dev/sdf3 /dev/disk/by-id/wwn-0x5002538f413394a5-part3 /dev/sde/dev/disk/by-id/wwn-0x5002538f413394a9 /dev/sde1 /dev/disk/by-id/wwn-0x5002538f413394a9-part1 /dev/sde2 /dev/disk/by-id/wwn-0x5002538f413394a9-part2 /dev/sde3 /dev/disk/by-id/wwn-0x5002538f413394a9-part3 /dev/sdd/dev/disk/by-id/wwn-0x5002538f413394ae /dev/sdd1 /dev/disk/by-id/wwn-0x5002538f413394ae-part1 /dev/sdd2 /dev/disk/by-id/wwn-0x5002538f413394ae-part2 /dev/sdd3 /dev/disk/by-id/wwn-0x5002538f413394ae-part3 /dev/sdg/dev/disk/by-id/wwn-0x5002538f413394b0 /dev/sdg1 /dev/disk/by-id/wwn-0x5002538f413394b0-part1 /dev/sdg2 /dev/disk/by-id/wwn-0x5002538f413394b0-part2 /dev/sdg3 /dev/disk/by-id/wwn-0x5002538f413394b0-part3 /dev/sda/dev/disk/by-id/wwn-0x5002538f42205e8e /dev/sda1 /dev/disk/by-id/wwn-0x5002538f42205e8e-part1 /dev/sda2 /dev/disk/by-id/wwn-0x5002538f42205e8e-part2 /dev/sda3 /dev/disk/by-id/wwn-0x5002538f42205e8e-part3 root@coyote:~# but like I wrote, 2 pairs with identical "serial numbers", so the assunption is that the last one overwrites the first 
one by udev, when IMO it should be yelling about the duplicates.

I straightened out the wrapping mess, and gave each entry a line number. I see nothing I recognize as representing serial number duplication among /dev/sdX (physical device) names:

/dev/md0    1  /dev/disk/by-id/md-name-coyote:0
/dev/md0    2  /dev/disk/by-id/md-uuid-3d5a3621:c0e32c8a:e3f7ebb3:318edbfb
/dev/md0p1  3  /dev/disk/by-id/md-name-coyote:0-part1
/dev/md0p1  4  /dev/disk/by-id/md-uuid-3d5a3621:c0e32c8a:e3f7ebb3:318edbfb-part1
/dev/md1    5  /dev/disk/by-id/md-name-_none_:1
/dev/md1    6  /dev/disk/by-id/md-uuid-57a88605:27f5a773:5be347c1:7c5e7342
/dev/md2    7  /dev/disk/by-id/md-name-coyote:2
/dev/md2    8  /dev/disk/by-id/md-uuid-bb6e03ce:19d290c8:5171004f:0127a392
Re: smartctl cannot access my storage, need syntax help
On Tue 16 Jan 2024 at 06:08:35 (-0500), Felix Miata wrote:
> Tom Furie composed on 2024-01-16 08:18 (UTC):
> > Felix Miata writes:
> >> /dev/sdc 18 /dev/disk/by-id/usb-Brother_MFC-J6920DW_BROG5F229909-0:0
> >> # How does a printer get a storage device assignment???
> > By having some kind of SD card slot or similar.
> So this pollution only results from a USB-connected printer? IP printer
> connections don't cause it too?

AIUI (not very well), you only get a /dev/sdX when the Linux kernel is what's writing the blocks on the filesystem. So when I plug in my Galaxy 4 mobile and tap the appropriate buttons on its screen, /dev/sdb{,1} appear as a block device and partition:

sdb      8:16   1  29.7G  0 disk
└─sdb1   8:17   1  29.7G  0 part

so I can run fdisk on the SD card while in the phone, for example:

$ sudo fdisk -l /dev/sdb
Disk /dev/sdb: 29.72 GiB, 31914983424 bytes, 62333952 sectors
Disk model: S5360 Card
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x03399e11

Device     Boot Start      End  Sectors  Size Id Type
/dev/sdb1        2048 62333951 62331904 29.7G  c W95 FAT32 (LBA)
$

OTOH with my A13 phone, I don't get a block device created, but just a FUSE wrapper round the filesystems that Android is running, both internal and any SD card:

$ mount
[ … ]
aft-mtp-mount on /media/samsungd type fuse.aft-mtp-mount (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)
$

Cheers,
David.
Re: smartctl cannot access my storage, need syntax help
On Tue 16 Jan 2024 at 09:40:19 (-0500), Greg Wooledge wrote: > On Tue, Jan 16, 2024 at 09:31:54AM -0500, Felix Miata wrote: > > David Wright composed on 2024-01-16 08:05 (UTC-0600): > > > On Tue 16 Jan 2024 at 00:55:52 (-0500), Felix Miata wrote: > > >> gene heskett composed on 2024-01-15 17:56 (UTC-0500): > > > > >>> Thanks for that composition: but it will be word wrapped: > > >>> root@coyote:~# for j in /dev/disk/by-id/* ; do printf '%s\t%s\n' > > >>> "$(realpath "$j")" "$j" ; done > > >>> /dev/sr0/dev/disk/by-id/ata-ATAPI_iHAS424_B_3524253_327133504865 > > >>> /dev/sdi/dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146 > > >>> /dev/sdj1 /dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146-part1 > > > > > It's right here at the top. > > > > I missed that, probably because i & j look similar in the big sea of > > alphanumerics, /and/ sdi has no partitions, while sdj1 has no parent disk. > > That > > seems to smell as much like a bug somewhere as two different disks with the > > same > > serial number, a cheap SATA port card maybe. Does ...1146 get duplication > > like > > that when connected to any/every available SATA port? > > I missed it too. It actually looks like someone copy/pasted the > pathnames on the right, but then manually typed the device names on > the left, and made a typo here. Or, somehow, the device names and > the pathnames got mixed together, and someone tried to separate them > manually, and got these two crossed. It's the sticky labels that convinced me. I had one last possibility in mind, that the serial numbers were being generated by the interfaces somehow, but they wouldn't be able to read the labels. I know nothing about Gene's interfaces, but my SD cards can appear with false by-id/ values depending on where they're plugged in: slots (on different PCs), via µSD-SD adapter, SD-USB adapter, etc. Cheers, David.
To partition or not to partition MD arrays (Was Re: smartctl cannot access my storage, need syntax help)
Hello,

On Tue, Jan 16, 2024 at 01:01:02PM -0800, David Christensen wrote:
> On 1/16/24 11:51, Franco Martelli wrote:
> > I thought it was mandatory for a RAID to partition drives with this
> > partition type, am I wrong?

In the ancient past it was required, because that was one of the ways that mdadm arrays were assembled: the md kernel driver saw the "Linux RAID" partition types and tried using them. If you weren't going to do that, you had to have an mdadm config file, or even specify the topology on the kernel command line. This was 15 or more years ago.

Ever since udev, each newly-appearing block device is handed to a script for incremental assembly based on the md metadata on the device itself, so any kind of block device will do.

> As I switched from mdadm(8) to zfs(8) years ago, perhaps another
> reader can explain what mdadm(8) does when given whole disks and
> when given disk partitions.

mdadm doesn't care.

The older set of people recommending partitions did so because drive capacities used to vary quite a lot more than they do today. So people used to say, "put a partition on and make it a few hundred MB less than the total size of the drive, then if you have to replace it with a slightly smaller one you'll be fine."

Since 2005 or so there has been a standard called IDEMA LBA1-03¹ about what the actual capacity in sectors should be for any stated drive capacity, and most drives obey this, though there are still a few exceptions. So this is very much less of a concern, especially for those buying "enterprise" storage.

The newer set of people recommending partitions are mostly doing so because there have been a few incidents of "helpful" PC motherboards detecting on boot what they think is a corrupt GPT, and replacing it with a blank one, damaging the RAID. This is a real thing that has happened to more than one person; it even got linked on Hacker News I believe.

Then there will just be people going by taste.

Personally I still put them directly on drives. If I ever get taken out by one of those crappy motherboards, I reserve the right to get a different religion.

Thanks,
Andy

¹ https://idema.org/wp-content/downloads/2169.pdf

--
https://bitfolk.com/ -- No-nonsense VPS hosting
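To make the IDEMA LBA1-03 point a bit more concrete, here is a small sketch of the sector-count formula as I understand it for drives with 512-byte logical sectors and an advertised capacity of 50 GB or more. The function name idema_lba512 is made up for illustration; check the constants against the linked PDF before relying on them.

```shell
#!/bin/sh
# Sketch only: expected LBA count for an advertised capacity in (decimal) GB,
# per the IDEMA LBA1-03 formula for 512-byte sectors:
#   LBA count = 97696368 + 1953504 * (GB - 50)
# idema_lba512 is a hypothetical helper name, not part of any tool.
idema_lba512() {
    gb=$1
    echo $(( 97696368 + 1953504 * (gb - 50) ))
}

# Example: a nominal 1 TB (1000 GB) drive.
# idema_lba512 1000
```

For a 1000 GB drive this works out to 1953525168 sectors, which is the figure fdisk reports for the 1 TB ST1000DM003 drives quoted elsewhere in this thread. Two drives that both follow the standard should therefore be interchangeable in an array without leaving slack at the end.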
Re: smartctl cannot access my storage, need syntax help
On 1/16/24 11:51, Franco Martelli wrote:
> On 15/01/24 at 08:43, David Christensen wrote:
> > When I built and ran a Debian 2 @ HDD RAID1 using mdadm(8), I did not
> > partition the HDD's -- I gave mdadm(8) the whole drives.
>
> I don't know if it is a good idea, in fact there is a special partition
> type for RAID arrays listed in fdisk, I used that for my RAID:
>
> ---
> ~# fdisk -l /dev/sd[a-d]
> Disk /dev/sda: 931,51 GiB, 1000204886016 bytes, 1953525168 sectors
> Disk model: ST1000DM003-1CH1
> Units: sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 4096 bytes
> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> Disklabel type: dos
> Disk identifier: 0x00088ecc
> ...
>
> I thought it was mandatory for a RAID to partition drives with this
> partition type, am I wrong?

STFW and RTFM, I have seen recommendations for and against using whole disks for RAID and for and against using partitions for RAID. And, as this is the Internet, there are countless rumors and speculation.

As I switched from mdadm(8) to zfs(8) years ago, perhaps another reader can explain what mdadm(8) does when given whole disks and when given disk partitions.

David
Re: smartctl cannot access my storage, need syntax help
On 15/01/24 at 08:43, David Christensen wrote:
> > This I am still trying to do, the first pass copied all 350G of /home
> > but went to the wrong drive, and I had mounted the drive by its label.
> > It is now /dev/sdh and all labels above it are now wrong. Crazy. These
> > SSD's all have an OTP serial number. I am tempted to use that serial
> > number as a label _I_ can control.
>
> When I built and ran a Debian 2 @ HDD RAID1 using mdadm(8), I did not
> partition the HDD's -- I gave mdadm(8) the whole drives.

I don't know if it is a good idea, in fact there is a special partition type for RAID arrays listed in fdisk, I used that for my RAID:

---
~# fdisk -l /dev/sd[a-d]
Disk /dev/sda: 931,51 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: ST1000DM003-1CH1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x00088ecc

Device     Boot Start        End    Sectors   Size Id Type
/dev/sda1  *     2048 1953523711 1953521664 931,5G fd Linux raid autodetect

Disk /dev/sdb: 931,51 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: ST1000DM003-1CH1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x000d65c9

Device     Boot Start        End    Sectors   Size Id Type
/dev/sdb1        2048 1953523711 1953521664 931,5G fd Linux raid autodetect

Disk /dev/sdc: 931,51 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: ST1000DM003-1CH1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x000306a3

Device     Boot Start        End    Sectors   Size Id Type
/dev/sdc1        2048 1953523711 1953521664 931,5G fd Linux raid autodetect

Disk /dev/sdd: 931,51 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: ST1000DM003-1CH1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x0007a1fe

Device     Boot Start        End    Sectors   Size Id Type
/dev/sdd1        2048 1953523711 1953521664 931,5G fd Linux raid autodetect
---

I thought it was mandatory for a RAID to partition drives with this partition type, am I wrong?

Cheers,
--
Franco Martelli
Re: smartctl cannot access my storage, need syntax help
Hello,

On Tue, Jan 16, 2024 at 01:17:42AM -0500, Felix Miata wrote:
> If rsync really is bugged, maybe a change of options would avoid the bug. Try
> instead of -av, -rlptgoDAXUNH. Could it be that verbosity is the OOM
> crippler, and not necessarily from rsync itself, but possibly from the
> xterm in which rsync is running? Does your source contain any hard links?
> Do you use ACLs or xattrs?

I'm totally burned out on trying to get info out of Gene, but my experience with rsync is that use of some options can massively increase memory usage. The options covered by -a don't tend to do it (and I doubt -v does anything), but things like --delay-updates, --delete-before, --delete-after and --prune-empty-dirs do.

This is because rsync normally finds files to transfer incrementally, so it only keeps a certain number of entries in memory and can sync any number of files without blowing up RAM, but those options disable that strategy. Even so, rsync only needs about 100 bytes of RAM per file that is checked on the source, and the size of the files doesn't matter.

In desperate circumstances, the file tree can be rsynced in multiple segments, e.g. one rsync for each subdir or whatever other split makes sense.

Maybe also ulimit can be used to set an artificially low value on the memory that rsync is allowed to use. It will fail sooner, but hopefully before using all the system's RAM and swap and having the oom-killer intervene.

Thanks,
Andy

--
https://bitfolk.com/ -- No-nonsense VPS hosting
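The last two suggestions (one rsync per subdirectory, each under a ulimit) could be combined roughly as below. This is only a sketch under assumptions: sync_in_segments, the RSYNC override variable and the 4 GiB default cap are all invented for illustration, and "ulimit -v" is a bash/dash feature that limits address space, which is only an approximation of "memory used".

```shell
#!/bin/sh
# Sketch only: sync a tree one top-level subdirectory at a time, each run
# under an address-space cap. RSYNC, sync_in_segments and the 4 GiB default
# are illustrative assumptions, nothing here comes from the thread.
RSYNC=${RSYNC:-rsync}

sync_in_segments() {
    src=$1
    dst=$2
    mem_kb=${3:-4194304}    # cap in KiB; 4194304 KiB = 4 GiB
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        # Subshell, so the lowered limit does not stick to the calling shell.
        (
            ulimit -v "$mem_kb" 2>/dev/null ||
                echo "warning: could not set memory cap for $name" >&2
            "$RSYNC" -a "$dir" "$dst/$name/"
        ) || echo "segment failed: $name" >&2
    done
}

# Example (paths are placeholders):
# sync_in_segments /home /mnt/newhome
```

Note that the glob skips plain files sitting directly in the source directory and any dot-directories, so those would need a separate pass. A segment that hits the cap should fail on its own and be reported, rather than dragging the whole machine into the oom-killer.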
Re: smartctl cannot access my storage, need syntax help
Hi,

i, too, wondered where there should be a duplicate serial number. But indeed:

David Wright wrote:
> > /dev/sdi  53 /dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146
> > /dev/sdj1 54 /dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146-part1
> ↑ that is /really/ bad!

Does the number of 4 device files /dev/sd[h-k] match the number of installed ata-Gigastone_SSD devices? Gene talked of "5, ordered in 2 separate orders". (Looking at https://lists.debian.org/debian-user/2024/01/msg00667.html) Now we see 3 to 4, depending on what one wants to believe.

Wild ideas:

One possible reason could be that a device is mapped to both /dev/sdi and /dev/sdj. udev would then suffer a race condition when creating the /dev/disk/by-id links.

Another could be that udev's assessment of the drives derails and that serial number information spilled from the assessment of /dev/sdi to the assessment of /dev/sdj*.

It would be interesting to see the output of

  ls -l /dev/sd[ij]*

in order to learn about the existence of /dev/sdj and the device numbers of sdi* and sdj*. Further one should inquire the serial numbers by

  lsblk -d -o NAME,MAJ:MIN,MODEL,SERIAL,WWN /dev/sd[hijk]

Have a nice day :)

Thomas
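As a possible follow-up to the lsblk inquiry, a small filter could flag any serial number that is reported by more than one whole disk. This is a sketch only: find_dup_serials is a made-up helper name, and it assumes input lines of the form "NAME SERIAL" such as "lsblk -d -n -o NAME,SERIAL" prints. Devices that report no serial at all are skipped.

```shell
#!/bin/sh
# Sketch only: read "NAME SERIAL" lines on stdin and print every serial
# that occurs more than once, followed by the device names that carry it.
# find_dup_serials is a hypothetical helper, not an existing tool.
find_dup_serials() {
    awk 'NF >= 2 { names[$2] = names[$2] " " $1; count[$2]++ }
         END { for (s in count) if (count[s] > 1) print s ":" names[s] }'
}

# Example on a live system (whole disks only, no header line):
# lsblk -d -n -o NAME,SERIAL | find_dup_serials
```

No output would mean every disk that reports a serial reports a distinct one, in which case a by-id clash would have to come from somewhere else, such as the race Thomas describes.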
Re: smartctl cannot access my storage, need syntax help
On 16/01/2024 15:18, Tom Furie wrote:
> > /dev/sdc 18 /dev/disk/by-id/usb-Brother_MFC-J6920DW_BROG5F229909-0:0
> > # How does a printer get a storage device assignment???
> By having some kind of SD card slot or similar.

I have heard that some devices expose a USB mass storage interface out of the box, so that an installer can autorun when the device is plugged in. The installer then switches the device to its normal mode. On Linux, usb-modeswitch might be required.
Re: smartctl cannot access my storage, need syntax help
On Tue, Jan 16, 2024 at 09:31:54AM -0500, Felix Miata wrote: > David Wright composed on 2024-01-16 08:05 (UTC-0600): > > > On Tue 16 Jan 2024 at 00:55:52 (-0500), Felix Miata wrote: > > >> gene heskett composed on 2024-01-15 17:56 (UTC-0500): > > >>> Thanks for that composition: but it will be word wrapped: > >>> root@coyote:~# for j in /dev/disk/by-id/* ; do printf '%s\t%s\n' > >>> "$(realpath "$j")" "$j" ; done > >>> /dev/sr0/dev/disk/by-id/ata-ATAPI_iHAS424_B_3524253_327133504865 > >>> /dev/sdi/dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146 > >>> /dev/sdj1 /dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146-part1 > > > It's right here at the top. > > I missed that, probably because i & j look similar in the big sea of > alphanumerics, /and/ sdi has no partitions, while sdj1 has no parent disk. > That > seems to smell as much like a bug somewhere as two different disks with the > same > serial number, a cheap SATA port card maybe. Does ...1146 get duplication like > that when connected to any/every available SATA port? I missed it too. It actually looks like someone copy/pasted the pathnames on the right, but then manually typed the device names on the left, and made a typo here. Or, somehow, the device names and the pathnames got mixed together, and someone tried to separate them manually, and got these two crossed.
Re: smartctl cannot access my storage, need syntax help
David Wright composed on 2024-01-16 08:05 (UTC-0600): > On Tue 16 Jan 2024 at 00:55:52 (-0500), Felix Miata wrote: >> gene heskett composed on 2024-01-15 17:56 (UTC-0500): >>> Thanks for that composition: but it will be word wrapped: >>> root@coyote:~# for j in /dev/disk/by-id/* ; do printf '%s\t%s\n' >>> "$(realpath "$j")" "$j" ; done >>> /dev/sr0/dev/disk/by-id/ata-ATAPI_iHAS424_B_3524253_327133504865 >>> /dev/sdi/dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146 >>> /dev/sdj1 /dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146-part1 > It's right here at the top. I missed that, probably because i & j look similar in the big sea of alphanumerics, /and/ sdi has no partitions, while sdj1 has no parent disk. That seems to smell as much like a bug somewhere as two different disks with the same serial number, a cheap SATA port card maybe. Does ...1146 get duplication like that when connected to any/every available SATA port? -- Evolution as taught in public schools is, like religion, based on faith, not based on science. Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata
Re: smartctl cannot access my storage, need syntax help
On Tue 16 Jan 2024 at 00:55:52 (-0500), Felix Miata wrote:
> gene heskett composed on 2024-01-15 17:56 (UTC-0500):
>
> > Thanks for that composition: but it will be word wrapped:
> > root@coyote:~# for j in /dev/disk/by-id/* ; do printf '%s\t%s\n' "$(realpath "$j")" "$j" ; done
> > /dev/sr0   /dev/disk/by-id/ata-ATAPI_iHAS424_B_3524253_327133504865
> > /dev/sdi   /dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146
> > /dev/sdj1  /dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146-part1

It's right here at the top.

[ … ]

> > root@coyote:~#
> > but like I wrote, 2 pairs with identical "serial numbers", so the
> > assumption is that the last one overwrites the first one by udev, when
> > IMO it should be yelling about the duplicates.
>
> I straightened out the wrapping mess, and gave each entry a line number. I see
> nothing I recognize as representing serial number duplication among /dev/sdX
> (physical device) names:
>
> /dev/md0    1 /dev/disk/by-id/md-name-coyote:0
> /dev/md0    2 /dev/disk/by-id/md-uuid-3d5a3621:c0e32c8a:e3f7ebb3:318edbfb
> /dev/md0p1  3 /dev/disk/by-id/md-name-coyote:0-part1
> /dev/md0p1  4 /dev/disk/by-id/md-uuid-3d5a3621:c0e32c8a:e3f7ebb3:318edbfb-part1
> /dev/md1    5 /dev/disk/by-id/md-name-_none_:1
> /dev/md1    6 /dev/disk/by-id/md-uuid-57a88605:27f5a773:5be347c1:7c5e7342
> /dev/md2    7 /dev/disk/by-id/md-name-coyote:2
> /dev/md2    8 /dev/disk/by-id/md-uuid-bb6e03ce:19d290c8:5171004f:0127a392
> /dev/sda    9 /dev/disk/by-id/ata-Samsung_SSD_870_QVO_1TB_S5RRNF0T201730V
> /dev/sda   10 /dev/disk/by-id/wwn-0x5002538f42205e8e
> /dev/sda1  11 /dev/disk/by-id/ata-Samsung_SSD_870_QVO_1TB_S5RRNF0T201730V-part1
> /dev/sda1  12 /dev/disk/by-id/wwn-0x5002538f42205e8e-part1
> /dev/sda2  13 /dev/disk/by-id/ata-Samsung_SSD_870_QVO_1TB_S5RRNF0T201730V-part2
> /dev/sda2  14 /dev/disk/by-id/wwn-0x5002538f42205e8e-part2
> /dev/sda3  15 /dev/disk/by-id/ata-Samsung_SSD_870_QVO_1TB_S5RRNF0T201730V-part3
> /dev/sda3  16 /dev/disk/by-id/wwn-0x5002538f42205e8e-part3
> /dev/sdb   17 /dev/disk/by-id/usb-USB_Mass_Storage_Device_816820130806-0:0
> /dev/sdc   18 /dev/disk/by-id/usb-Brother_MFC-J6920DW_BROG5F229909-0:0
> # How does a printer get a storage device assignment???

I'd /love/ my HP8500 scanner (that's all that works now) to have the USB stick (which I scan onto) be visible to a connected PC or the network. Is that what /dev/sdc is?

> /dev/sdd   19 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V
> /dev/sdd   20 /dev/disk/by-id/wwn-0x5002538f413394ae
> /dev/sdd1  21 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V-part1
> /dev/sdd1  22 /dev/disk/by-id/wwn-0x5002538f413394ae-part1
> /dev/sdd2  23 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V-part2
> /dev/sdd2  24 /dev/disk/by-id/wwn-0x5002538f413394ae-part2
> /dev/sdd3  25 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V-part3
> /dev/sdd3  26 /dev/disk/by-id/wwn-0x5002538f413394ae-part3
> /dev/sde   27 /dev/disk/by-id/wwn-0x5002538f413394a9
> /dev/sde   28 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E
> /dev/sde1  29 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E-part1
> /dev/sde1  30 /dev/disk/by-id/wwn-0x5002538f413394a9-part1
> /dev/sde2  31 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E-part2
> /dev/sde2  32 /dev/disk/by-id/wwn-0x5002538f413394a9-part2
> /dev/sde3  33 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E-part3
> /dev/sde3  34 /dev/disk/by-id/wwn-0x5002538f413394a9-part3
> /dev/sdf   35 /dev/disk/by-id/wwn-0x5002538f413394a5
> /dev/sdf   36 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T
> /dev/sdf1  37 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T-part1
> /dev/sdf1  38 /dev/disk/by-id/wwn-0x5002538f413394a5-part1
> /dev/sdf2  39 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T-part2
> /dev/sdf2  40 /dev/disk/by-id/wwn-0x5002538f413394a5-part2
> /dev/sdf3  41 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T-part3
> /dev/sdf3  42 /dev/disk/by-id/wwn-0x5002538f413394a5-part3
> /dev/sdg   43 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W
> /dev/sdg   44 /dev/disk/by-id/wwn-0x5002538f413394b0
> /dev/sdg1  45 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W-part1
> /dev/sdg1  46 /dev/disk/by-id/wwn-0x5002538f413394b0-part1
> /dev/sdg2  47 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W-part2
> /dev/sdg2  48 /dev/disk/by-id/wwn-0x5002538f413394b0-part2
> /dev/sdg3  49 /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W-part3
> /dev/sdg3  50 /dev/disk/by-id/wwn-0x5002538f413394b0-part3
> /dev/sdh   51 /dev/disk/by-id/ata-Gigastone_SSD_GSTD02TB230102
> /dev/sdh1  52 /dev/disk/by-id/ata-Gigastone_SSD_GSTD02TB230102-part1
> /dev/sdi   53
Re: smartctl cannot access my storage, need syntax help
On 16/01/2024 12:08, Felix Miata wrote:
> Tom Furie composed on 2024-01-16 08:18 (UTC):
> > Felix Miata writes:
> >> /dev/sdc 18 /dev/disk/by-id/usb-Brother_MFC-J6920DW_BROG5F229909-0:0
> >> # How does a printer get a storage device assignment???
> > By having some kind of SD card slot or similar.
> So this pollution only results from a USB-connected printer? IP printer
> connections don't cause it too?

Yes, only USB-connected printers; IP printers don't cause it.
Re: smartctl cannot access my storage, need syntax help
Tom Furie composed on 2024-01-16 08:18 (UTC): > Felix Miata writes: >> /dev/sdc 18 /dev/disk/by-id/usb-Brother_MFC-J6920DW_BROG5F229909-0:0 # >> How does a printer get a storage device assignment??? > By having some kind of SD card slot or similar. So this pollution only results from a USB-connected printer? IP printer connections don't cause it too? -- Evolution as taught in public schools is, like religion, based on faith, not based on science. Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata
Re: smartctl cannot access my storage, need syntax help
Felix Miata writes: > /dev/sdc 18 /dev/disk/by-id/usb-Brother_MFC-J6920DW_BROG5F229909-0:0 # > How does a printer get a storage device assignment??? By having some kind of SD card slot or similar.
Re: smartctl cannot access my storage, need syntax help
Felix Miata composed on 2024-01-16 01:05 (UTC-0500): > gene heskett composed on 2024-01-15 18:37 (UTC-0500): >> Ah,but I finally glombed onto the bug tan memory bar in htop as it was >> runniing, someplace in the data chain is a huge memory leak, my crash is >> caused by the OOM daemon killing things. And it only occurs when I run >> rsync. Only takes it 10 minute to eat 32G of memory, then 500k into >> swap, and the OOM daemon start killing the system until there's nothing >> left to run. > What does free report before starting rsync? Do you have all your swap on a > partition? Do you have any swapspace? > I would log out of XFCE, login on a vtty to open top, then login on another > to try > to run rsync. If that fails OOM too, since the target is ostensibly starting > from > scratch, use MC, and divide the job into the source's directories if > necessary. MC > gets rather bogged down if you try to do a bazillion individual files in a > single > copy operation. Trying to think outside the box, something else to think about, from the man page: [quote] --archive, -a This is equivalent to -rlptgoD. It is a quick way of saying you want recursion and want to preserve almost everything. Be aware that it does not include preserving ACLs (-A), xattrs (-X), atimes (-U), crtimes (-N), nor the finding and preserving of hardlinks (-H). [/quote] If rsync really is bugged, maybe a change of options would avoid the bug. Try instead of -av, -rlptgoDAXUNH. Could it be that verbosity is the OOM crippler, and not necessarily from rsync itself, but possibly from the xterm in which rsync is running? Does your source contain any hard links? Do you use ACLs or xattrs? -- Evolution as taught in public schools is, like religion, based on faith, not based on science. Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata
Re: smartctl cannot access my storage, need syntax help
gene heskett composed on 2024-01-15 18:37 (UTC-0500): > Ah,but I finally glombed onto the bug tan memory bar in htop as it was > runniing, someplace in the data chain is a huge memory leak, my crash is > caused by the OOM daemon killing things. And it only occurs when I run > rsync. Only takes it 10 minute to eat 32G of memory, then 500k into > swap, and the OOM daemon start killing the system until there's nothing > left to run. What does free report before starting rsync? Do you have all your swap on a partition? Do you have any swapspace? I would log out of XFCE, login on a vtty to open top, then login on another to try to run rsync. If that fails OOM too, since the target is ostensibly starting from scratch, use MC, and divide the job into the source's directories if necessary. MC gets rather bogged down if you try to do a bazillion individual files in a single copy operation. -- Evolution as taught in public schools is, like religion, based on faith, not based on science. Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata
Re: smartctl cannot access my storage, need syntax help
gene heskett composed on 2024-01-15 17:56 (UTC-0500): > Thanks for that composition: but it will be word wrapped: > root@coyote:~# for j in /dev/disk/by-id/* ; do printf '%s\t%s\n' > "$(realpath "$j")" "$j" ; done > /dev/sr0/dev/disk/by-id/ata-ATAPI_iHAS424_B_3524253_327133504865 > /dev/sdi/dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146 > /dev/sdj1 /dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146-part1 > /dev/sdh/dev/disk/by-id/ata-Gigastone_SSD_GSTD02TB230102 > /dev/sdh1 /dev/disk/by-id/ata-Gigastone_SSD_GSTD02TB230102-part1 > /dev/sdk/dev/disk/by-id/ata-Gigastone_SSD_GSTG02TB230206 > /dev/sdk1 /dev/disk/by-id/ata-Gigastone_SSD_GSTG02TB230206-part1 > /dev/sdf/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T > /dev/sdf1 > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T-part1 > /dev/sdf2 > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T-part2 > /dev/sdf3 > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302498T-part3 > /dev/sde/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E > /dev/sde1 > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E-part1 > /dev/sde2 > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E-part2 > /dev/sde3 > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302502E-part3 > /dev/sdd/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V > /dev/sdd1 > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V-part1 > /dev/sdd2 > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V-part2 > /dev/sdd3 > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302507V-part3 > /dev/sdg/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W > /dev/sdg1 > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W-part1 > /dev/sdg2 > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W-part2 > /dev/sdg3 > /dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R302509W-part3 > /dev/sda/dev/disk/by-id/ata-Samsung_SSD_870_QVO_1TB_S5RRNF0T201730V > /dev/sda1 > 
/dev/disk/by-id/ata-Samsung_SSD_870_QVO_1TB_S5RRNF0T201730V-part1 > /dev/sda2 > /dev/disk/by-id/ata-Samsung_SSD_870_QVO_1TB_S5RRNF0T201730V-part2 > /dev/sda3 > /dev/disk/by-id/ata-Samsung_SSD_870_QVO_1TB_S5RRNF0T201730V-part3 > /dev/md0/dev/disk/by-id/md-name-coyote:0 > /dev/md0p1 /dev/disk/by-id/md-name-coyote:0-part1 > /dev/md2/dev/disk/by-id/md-name-coyote:2 > /dev/md1/dev/disk/by-id/md-name-_none_:1 > /dev/md0/dev/disk/by-id/md-uuid-3d5a3621:c0e32c8a:e3f7ebb3:318edbfb > /dev/md0p1 > /dev/disk/by-id/md-uuid-3d5a3621:c0e32c8a:e3f7ebb3:318edbfb-part1 > /dev/md1/dev/disk/by-id/md-uuid-57a88605:27f5a773:5be347c1:7c5e7342 > /dev/md2/dev/disk/by-id/md-uuid-bb6e03ce:19d290c8:5171004f:0127a392 > /dev/sdc/dev/disk/by-id/usb-Brother_MFC-J6920DW_BROG5F229909-0:0 > /dev/sdb/dev/disk/by-id/usb-USB_Mass_Storage_Device_816820130806-0:0 > /dev/sdf/dev/disk/by-id/wwn-0x5002538f413394a5 > /dev/sdf1 /dev/disk/by-id/wwn-0x5002538f413394a5-part1 > /dev/sdf2 /dev/disk/by-id/wwn-0x5002538f413394a5-part2 > /dev/sdf3 /dev/disk/by-id/wwn-0x5002538f413394a5-part3 > /dev/sde/dev/disk/by-id/wwn-0x5002538f413394a9 > /dev/sde1 /dev/disk/by-id/wwn-0x5002538f413394a9-part1 > /dev/sde2 /dev/disk/by-id/wwn-0x5002538f413394a9-part2 > /dev/sde3 /dev/disk/by-id/wwn-0x5002538f413394a9-part3 > /dev/sdd/dev/disk/by-id/wwn-0x5002538f413394ae > /dev/sdd1 /dev/disk/by-id/wwn-0x5002538f413394ae-part1 > /dev/sdd2 /dev/disk/by-id/wwn-0x5002538f413394ae-part2 > /dev/sdd3 /dev/disk/by-id/wwn-0x5002538f413394ae-part3 > /dev/sdg/dev/disk/by-id/wwn-0x5002538f413394b0 > /dev/sdg1 /dev/disk/by-id/wwn-0x5002538f413394b0-part1 > /dev/sdg2 /dev/disk/by-id/wwn-0x5002538f413394b0-part2 > /dev/sdg3 /dev/disk/by-id/wwn-0x5002538f413394b0-part3 > /dev/sda/dev/disk/by-id/wwn-0x5002538f42205e8e > /dev/sda1 /dev/disk/by-id/wwn-0x5002538f42205e8e-part1 > /dev/sda2 /dev/disk/by-id/wwn-0x5002538f42205e8e-part2 > /dev/sda3 /dev/disk/by-id/wwn-0x5002538f42205e8e-part3 > root@coyote:~# > but like I wrote, 2 pairs with identical 
"serial numbers", so the > assunption is that the last one overwrites the first on by udev, when > IMO it should be yelling about the duplicats. I straightened out the wrapping mess, and gave each entry a line number. I see nothing I recognize as representing serial number duplication among /dev/sdX (physical device) names: /dev/md0 1 /dev/disk/by-id/md-name-coyote:0 /dev/md0 2 /dev/disk/by-id/md-uuid-3d5a3621:c0e32c8a:e3f7ebb3:318edbfb /dev/md0p1 3 /dev/disk/by-id/md-name-coyote:0-part1 /dev/md0p1 4 /dev/disk/by-id/md-uuid-3d5a3621:c0e32c8a:e3f7ebb3:318edbfb-part1 /dev/md1 5 /dev/disk/by-id/md-name-_none_:1 /dev/md1 6
Re: smartctl cannot access my storage, need syntax help
On Mon 15 Jan 2024 at 18:27:14 (-0500), gene heskett wrote: > On 1/15/24 14:57, David Wright wrote: > > On Sun 14 Jan 2024 at 20:15:16 (-0500), gene heskett wrote: > > > On 1/14/24 18:57, Felix Miata wrote: > > > > gene heskett composed on 2024-01-14 18:39 (UTC-0500): > > > > > Felix Miata wrote: > > > > > > My point was entirely about suitability of /mnt/ for fstab entries. > > > > > And my point is that for a one time copy, its was handy. I didn't have > > > > > to mkdir a mount point for it. Obviously you mean something different from what would be a conventional interpretation of those two sentences. > > > > /mnt/ is intended for one-time copies, just the ticket for that > > > > particular > > > > exercise. > > > I don't recall ever mounting something to /mnt, always to a subdir in > > > mount. > > > > How did you mount to a subdir in /mnt without making a mount point? > > Your two statements appear contradictory. > > On this particular ball of rock and water at least, called a planet in > most circles, you "mkdir /mnt/whatever". You don't have to mount > directly on /mnt and I don't think I ever have. Do it 50 times with > your own version of "whatever", same with any path to anyplace on the > system. Nothing special about it. And I've never created any mount point under /mnt. For a one time copy, /mnt is handy; always there, I don't have to mkdir at all. Cheers, David.
Re: smartctl cannot access my storage, need syntax help
On Mon 15 Jan 2024 at 20:31:55 (-0500), gene heskett wrote:
> On 1/15/24 19:11, David Christensen wrote:
> > On 1/15/24 16:03, gene heskett wrote:
> > > On 1/15/24 18:41, gene heskett wrote:
> > > > On 1/15/24 17:58, gene heskett wrote:
> > > > > On 1/15/24 14:55, David Wright wrote:
> > > > > > On Mon 15 Jan 2024 at 08:39:37 (-0500), gene heskett wrote:
> > > > > > > ata-Gigastone_SSD_GST02TBG221146
> > > > > > > ata-Gigastone_SSD_GSTD02TB230102
> > > > > > > ata-Gigastone_SSD_GSTG02TB230206
> > > > > >
> > > > > > these devices appear to have normal serial numbers. Do they bear
> > > > > > any other indication, like engravings or stickers? If not, I would,
> > > > > > in turn, plug each one in, read the serial number from its symlink,
> > > > > > and write on it with a marker. While doing that, you could also
> > > > > > run smartctl.
> > >
> > > There is a sticker on the bottom containing the numbers you
> > > see above, and a (upc?) bar code I don't have a reader for.
> >
> > So, two stickers have one number, two stickers have another
> > number, and one sticker has a third number? Or, three stickers
> > have one number, one sticker has another number, and the last
> > sticker has a third number?
>
> 5 ssd's
> 3 unique numbers on those 5 stickers, the same numbers you can see
> above. 2 drives with the same sticker, 2 more drives that have
> identical stickers and one with a different sticker. I am inclined to
> think the numbers are based on production batches, and not unique as
> there may be 500 in each batch.

Ouch. Well, that leaves you with several choices, like exchanging two
of them, or moving them to different machines, or using them for
backing up but not at the same time as their twin. That's if your use
case relies on their serial numbers.

If you're using them in a more conventional manner, where UUIDs,
LABELs, PARTUUIDs and PARTLABELs are stable, and serial numbers are
ignored, then you should have no problems. Just start by inserting
them separately for partitioning and filesystem creation with unique
strings. But obviously step one is labelling them (unless you're
exchanging two of them in nearly-new condition).

BTW, I wrote:

> You haven't shown any evidence of such LABELling, and most of your
> anecdotal narratives don't give much confidence for us to really
> know what was actually done. But to be fair, anything could
> happen if the hardware is not working properly.

Nothing at all there about lying.

Cheers,
David.
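For anyone following along, the "partitioning and filesystem creation with unique strings" step could look roughly like the sketch below. It is a dry run: `label_plan` is a hypothetical helper that only prints the commands rather than running them. It assumes `sgdisk` (from the gdisk package) and an ext4 filesystem, neither of which the thread specifies, and `/dev/sdX` is a placeholder for whichever single drive is plugged in at the time.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not from the thread): print the commands
# that would give one drive a GPT partition name and an ext4 filesystem
# label built from the same unique string. Nothing is executed.
# Usage: label_plan DEVICE LABEL
label_plan() {
    dev=$1
    label=$2
    if [ -z "$dev" ] || [ -z "$label" ]; then
        echo "usage: label_plan DEVICE LABEL" >&2
        return 2
    fi
    # ext4 volume labels hold at most 16 bytes. This check counts
    # characters, so it assumes a plain ASCII label.
    if [ "${#label}" -gt 16 ]; then
        echo "label too long for ext4 (max 16): $label" >&2
        return 1
    fi
    # One partition spanning the whole disk, named via the GPT PARTLABEL.
    echo "sgdisk -n 1:0:0 -c 1:$label $dev"
    # Filesystem LABEL set to the same string (assumes sdX-style naming,
    # where the first partition of /dev/sdX is /dev/sdX1).
    echo "mkfs.ext4 -L $label ${dev}1"
}

# Example: plan for the drive marked "gst-ssd-1" with a marker pen.
# label_plan /dev/sdX gst-ssd-1
```

Printing the commands first leaves a chance to confirm the device name before anything destructive is run. The same string written on the drive then shows up under /dev/disk/by-partlabel/ and /dev/disk/by-label/.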
Re: smartctl cannot access my storage, need syntax help
On 1/15/24 17:31, gene heskett wrote:
> On 1/15/24 19:11, David Christensen wrote:
> > On 1/15/24 16:03, gene heskett wrote:
> > > On 1/15/24 18:41, gene heskett wrote:
> > > > On 1/15/24 17:58, gene heskett wrote:
> > > > > On 1/15/24 14:55, David Wright wrote:
> > > > > > On Mon 15 Jan 2024 at 08:39:37 (-0500), gene heskett wrote:
> > > > > > > ata-Gigastone_SSD_GST02TBG221146
> > > > > > > ata-Gigastone_SSD_GSTD02TB230102
> > > > > > > ata-Gigastone_SSD_GSTG02TB230206
> > > > > >
> > > > > > these devices appear to have normal serial numbers. Do they bear
> > > > > > any other indication, like engravings or stickers? If not, I would,
> > > > > > in turn, plug each one in, read the serial number from its symlink,
> > > > > > and write on it with a marker. While doing that, you could also
> > > > > > run smartctl.
> > >
> > > There is a sticker on the bottom containing the numbers you
> > > see above, and a (upc?) bar code I don't have a reader for.
> >
> > So, two stickers have one number, two stickers have another
> > number, and one sticker has a third number? Or, three stickers
> > have one number, one sticker has another number, and the last
> > sticker has a third number?
>
> 5 ssd's
> 3 unique numbers on those 5 stickers, the same numbers you can see
> above. 2 drives with the same sticker, 2 more drives that have
> identical stickers and one with a different sticker. I am inclined to
> think the numbers are based on production batches, and not unique as
> there may be 500 in each batch.

Duplicate serial numbers are going to cause confusion. If any of the
drives with duplicate numbers are eligible for return, I would return
them. If not, perhaps you could resell them to somebody.

If you are going to keep them, I seem to recall that all five drives
were partitioned with GPT and had one large partition (?). You could
invent a unique identifier for each drive, put a physical label on
each drive, and assign the same identifier to each GPT partition label.

Alternatively, UUIDs and/or PARTUUIDs should be unique for both MBR
and GPT:

# ls -l /dev/disk/by-uuid/

# ls -l /dev/disk/by-partuuid/

David
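The uniqueness of those identifiers can be checked before relying on them. The sketch below is one way to do that: `dup_values` is a hypothetical helper (not from the thread) that reads `NAME VALUE` pairs on standard input and prints any value shared by more than one device. Feeding it `lsblk` output, as in the trailing comments, is an assumption about how the drives are attached.

```shell
#!/bin/sh
# Hypothetical helper: report values that occur on more than one input line.
# Input:  one "NAME VALUE" pair per line (lines with no VALUE are skipped).
# Output: "VALUE: NAME NAME ..." for each duplicated VALUE.
dup_values() {
    awk '
        NF >= 2 { count[$2]++; names[$2] = names[$2] " " $1 }
        END     { for (v in count) if (count[v] > 1) print v ":" names[v] }
    '
}

# Possible uses (need the drives attached; lsblk is part of util-linux):
#   lsblk -d -n -o NAME,SERIAL   | dup_values   # lists shared serial numbers
#   lsblk -l -n -o NAME,PARTUUID | dup_values   # should print nothing
```

If the second command prints nothing, the PARTUUIDs are distinct and can stand in for the serial numbers when telling the drives apart.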
Re: smartctl cannot access my storage, need syntax help
On 1/15/24 19:11, David Christensen wrote:
> On 1/15/24 16:03, gene heskett wrote:
> > On 1/15/24 18:41, gene heskett wrote:
> > > On 1/15/24 17:58, gene heskett wrote:
> > > > On 1/15/24 14:55, David Wright wrote:
> > > > > On Mon 15 Jan 2024 at 08:39:37 (-0500), gene heskett wrote:
> > > > > > ata-Gigastone_SSD_GST02TBG221146
> > > > > > ata-Gigastone_SSD_GSTD02TB230102
> > > > > > ata-Gigastone_SSD_GSTG02TB230206
> > > > >
> > > > > these devices appear to have normal serial numbers. Do they bear
> > > > > any other indication, like engravings or stickers? If not, I would,
> > > > > in turn, plug each one in, read the serial number from its symlink,
> > > > > and write on it with a marker. While doing that, you could also
> > > > > run smartctl.
> >
> > There is a sticker on the bottom containing the numbers you
> > see above, and a (upc?) bar code I don't have a reader for.
>
> So, two stickers have one number, two stickers have another
> number, and one sticker has a third number? Or, three stickers
> have one number, one sticker has another number, and the last
> sticker has a third number?

5 ssd's
3 unique numbers on those 5 stickers, the same numbers you can see
above. 2 drives with the same sticker, 2 more drives that have
identical stickers and one with a different sticker. I am inclined to
think the numbers are based on production batches, and not unique as
there may be 500 in each batch.

> David

Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
 -Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis
Re: smartctl cannot access my storage, need syntax help
On 1/15/24 16:03, gene heskett wrote:
> On 1/15/24 18:41, gene heskett wrote:
> > On 1/15/24 17:58, gene heskett wrote:
> > > On 1/15/24 14:55, David Wright wrote:
> > > > On Mon 15 Jan 2024 at 08:39:37 (-0500), gene heskett wrote:
> > > > > ata-Gigastone_SSD_GST02TBG221146
> > > > > ata-Gigastone_SSD_GSTD02TB230102
> > > > > ata-Gigastone_SSD_GSTG02TB230206
> > > >
> > > > these devices appear to have normal serial numbers. Do they bear
> > > > any other indication, like engravings or stickers? If not, I would,
> > > > in turn, plug each one in, read the serial number from its symlink,
> > > > and write on it with a marker. While doing that, you could also
> > > > run smartctl.
>
> There is a sticker on the bottom containing the numbers you
> see above, and a (upc?) bar code I don't have a reader for.

So, two stickers have one number, two stickers have another number,
and one sticker has a third number? Or, three stickers have one
number, one sticker has another number, and the last sticker has a
third number?

David