On 3/17/2024 4:32 AM, Andrea Venturoli wrote:
On 3/15/24 19:17, mike tancsa wrote:

(da5:mpr0:0:15:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)

Hello.
I know I'm probably blaming the wrong component, but is your PSU up to the task? How many drives do you have? Are they power-hungrier than the others you tried (Samsung ???)?
Do you have a spare PSU to test/add?

Probably this is not the cause... still, before you bid farewell to 400 bucks...


hehe, thanks Andrea :)  I too don't want to be out the money. The power supply is for sure a good thing to check. In this case, the main server chassis is sized with a couple of redundant 1000W power supplies that should handle 12 full HDDs. Pretty sure 6 SSDs should not stress it beyond that point. But I had 2 other test boxes on the bench and the one common variable seems to be the WDs.

I feel like this is a sunk cost I am pushing myself into, but I did do some more testing.  My co-worker came across this post which was interesting.

https://forum.hddguru.com/viewtopic.php?f=10&t=43284

The very last entry says

"For WD BLUE SA 510 there are some problems with this type of SSD. This YODA model
To fix the SSD if it is still recognized, use the firmware update tools.
And then do a secure erase or full wipe of the SSD. After this it will work well. I can give you a link to this utility if it necessary. Also possible download it from manufacture FTP. If it is not recognized by the computer or is identified as a SSD device, there only one way, use production tools with new firmware to begin the production process by testing the controller and NAND chip and forming a translator. The SSD will be like brand new.
"

After I did the erase, the tests worked for a good 5 cycles and performance was MUCH smoother and more consistent. But then the drives started to fail again.  So I really wonder if TRIM has something to do with it, as my test is essentially writing a 250G dataset with about 28 million txt files, destroying the dataset and then copying it again.
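
For anyone wanting to reproduce this kind of load, here is a minimal sketch of a write/copy/destroy cycle. It is not the actual test script: the function names, file sizes and directory layout are hypothetical, and it uses plain directories with rmtree rather than destroying a real dataset. The idea is only that each cycle allocates many small files and then frees them all, which is the pattern that would generate a burst of TRIM requests on a filesystem with TRIM enabled.

```python
import shutil
from pathlib import Path


def build_tree(root: Path, n_dirs: int, files_per_dir: int, payload: str) -> int:
    """Create n_dirs * files_per_dir small text files under root; return the count."""
    count = 0
    for d in range(n_dirs):
        sub = root / f"dir{d:04d}"
        sub.mkdir(parents=True)  # also creates root on the first pass
        for f in range(files_per_dir):
            (sub / f"file{f:06d}.txt").write_text(payload)
            count += 1
    return count


def run_cycles(base: Path, cycles: int, n_dirs: int, files_per_dir: int) -> list[int]:
    """Repeat: build a source tree, copy it, count the copies, then delete both trees.

    Deleting the trees frees all their blocks at once (hypothetical stand-in
    for destroying the dataset in the real test).
    """
    copied_per_cycle = []
    for _ in range(cycles):
        src = base / "src"
        dst = base / "dst"
        build_tree(src, n_dirs, files_per_dir, "x" * 1024)  # 1 KiB per file, an assumption
        shutil.copytree(src, dst)
        copied_per_cycle.append(sum(1 for _ in dst.rglob("*.txt")))
        shutil.rmtree(src)
        shutil.rmtree(dst)
    return copied_per_cycle
```

Pointing base at a scratch directory on the pool under test and scaling n_dirs and files_per_dir up would approximate the file count described above; the numbers to use are a judgment call.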

I noticed these 2 commits for other drives. I wonder if the WD is having similar issues.

https://cgit.freebsd.org/src/commit/?h=stable/14&id=bf11fee6a5cf97102f87695185cadb63d5a2a7de
and
https://cgit.freebsd.org/src/commit/?h=stable/14&id=50aa22323424ccea00ef5d8f24e729a480cc77eb

I hope you don't mind me bcc'ing you, Andriy.  I noticed you only added the NCQ quirks for CAM ata and not for CAM scsi. I am running into odd issues with some WD drives and am wondering whether these WD SA 510 drives have the same underlying limitation as the Samsungs. However, in my use of the Samsungs I have not been able to trigger these bugs so far.

    ---Mike
