Imagine you have a nice fancy pile of disks in a server’s SAS tray. You don’t have a RAID controller and are using software RAID - perhaps because you want to let ZFS manage it? Whatever the reason, you have it and things are cool. Until suddenly the inevitable happens and you have a disk fail. How the heck are you supposed to know which disk goes where? /dev/sdae, for example, doesn’t tell you what slot! Fortunately, there’s a good chance you can blink the lights via sg_ses - a utility that talks to the enclosure through the SCSI generic (sg) driver included with the kernel.
You will need the “sg3_utils” package. On RHEL derivatives (and SUSE), this is sg3_utils, as you might suspect. On Debian derivatives, look for sg3-utils. Also, if your storage isn’t using the SCSI generic driver or doesn’t support SCSI Enclosure Services (SES), this won’t work for you.
Presumably if you’ve gotten to the point where you need to do all this, you know you have a disk failure and your software RAID and/or SMART is telling you which disk it is. In this document, we’ll assume the disk is /dev/sdae. If you’ve never seen a name like that before: when the kernel runs out of single letters, another letter is appended (i.e., sdz is followed by sdaa).
There may be other/better ways to get at this bit of information, but I don’t know about them. This works on my system. Do a long listing of /dev/disk/by-path and look for items that are links to the disk in question. On our example system:
[root@redacted ~]# ls -l /dev/disk/by-path | grep 'sdae$'
lrwxrwxrwx. 1 root root 10 Jan 25 17:24 pci-0000:03:00.0-sas-0x5000c50041f23b8a-lun-0 -> ../../sdae
lrwxrwxrwx. 1 root root 10 Jan 25 17:24 pci-0000:03:00.0-sas-exp0x500c04f2cc388cbf-phy18-lun-0 -> ../../sdae
“Hey, there’s more than one!” - yes, good eyes… Look for the hex code following “sas-” - the one prefixed with “exp” is the “attached SAS address” and, I believe, is the SAS address of the enclosure the disk is installed within. You want the other one, which is 0x5000c50041f23b8a in this example. This is the SAS address of the disk itself.
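If you would rather not pick the address out by eye, the filter below is a minimal sketch of the same rule: keep the hex code that directly follows “sas-” and skip the “exp” entry. The function name disk_sas_address is made up for this example, and it assumes link names shaped like the two shown above; other HBAs or udev versions may name things differently.

```shell
# Hypothetical helper: read by-path link names (or `ls -l` lines) on stdin
# and print the SAS address of the disk itself.
# Assumption: names look like "...-sas-0x<hex>-lun-<n>", as in the listing above.
disk_sas_address() {
    # Entries of the form "-sas-exp0x..." do not match, because "exp"
    # sits between "sas-" and "0x", so only the disk address is printed.
    sed -n 's/.*-sas-\(0x[0-9a-fA-F]*\)-lun-.*/\1/p'
}
```

On a real system this could be fed with: ls -l /dev/disk/by-path | grep 'sdae$' | disk_sas_address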
Next, you want to identify the /dev/sg* devices that are disk enclosures. Fortunately the lsscsi command can help:
[root@redacted ~]# lsscsi -g | grep -i encl
[0:0:24:0] enclosu DELL MD1220 1.06 - /dev/sg24
[0:0:49:0] enclosu DELL MD1220 1.06 - /dev/sg49
In this example, there are two - /dev/sg24 and /dev/sg49.
If you remove the grep from the previous command (the one that finds enclosures), you may find that on your systems the ordering of disks and enclosures is predictable. I am not assuming this is true, so we do a bit more digging to be sure.
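For scripting, the enclosure device nodes can be pulled out of the lsscsi listing instead of being read off the screen. This is only a sketch: the function name enclosure_sg_devices is invented here, and it assumes the column layout shown above, with the device type in the second column and the sg node in the last one.

```shell
# Hypothetical helper: read `lsscsi -g` output on stdin and print the
# /dev/sg* node of every enclosure, one per line.
# Assumption: column 2 is the device type ("enclosu") and the last
# column is the SCSI generic device, as in the example output above.
enclosure_sg_devices() {
    awk '$2 == "enclosu" { print $NF }'
}
```

Typical use would be: lsscsi -g | enclosure_sg_devices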
For each enclosure found earlier, you’ll want to run sg_ses --join /dev/sg49 (-j for short; of course, change the device argument as necessary):
[root@redacted ~]# sg_ses -j /dev/sg49
This will print out an absolute ton of information, so you will want to pipe it through a pager or redirect it to a file for your perusal. Look for the disk’s SAS address in the output. In our example, it was found on /dev/sg49 and not /dev/sg24:
Slot 6 [0,6] Element type: Array device slot
Enclosure Status:
Predicted failure=0, Disabled=0, Swap=0, status: Unknown
OK=0, Reserved device=0, Hot spare=0, Cons check=0
In crit array=0, In failed array=0, Rebuild/remap=0, R/R abort=0
App client bypass A=0, Do not remove=0, Enc bypass A=0, Enc bypass B=0
Ready to insert=0, RMV=0, Ident=0, Report=0
App client bypass B=0, Fault sensed=0, Fault reqstd=0, Device off=0
Bypassed A=0, Bypassed B=0, Dev bypassed A=0, Dev bypassed B=0
Additional Element Status:
Transport protocol: SAS
number of phys: 2, not all phys: 0, device slot number: 6
phy index: 0
device type: end device
initiator port for:
target port for: SSP
attached SAS address: 0x500c04f2cc388c3f
SAS address: 0x5000c50041f23b89
phy identifier: 0x0
phy index: 1
device type: end device
initiator port for:
target port for: SSP
attached SAS address: 0x500c04f2cc388cbf
SAS address: 0x5000c50041f23b8a
phy identifier: 0x1
This is a whole lot of useful information for sure, but we only care about two things:
SAS address: 0x5000c50041f23b8a
Slot 6 [0,6] Element type: Array device slot
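Searching that output by hand gets old quickly, so here is a rough sketch of the same lookup as a filter. The name slot_for_sas_address is hypothetical, and the parsing assumes the layout shown above: an element header line containing “Element type:” with the descriptor name in front of the bracketed index, followed further down by “SAS address:” lines. Descriptor names differ between enclosure vendors, so treat the result as a hint and double check it.

```shell
# Hypothetical helper: read `sg_ses -j` output on stdin and print the
# element descriptor name (e.g. "Slot 6") that reports the SAS address
# given as the first argument. Prints nothing if the address is absent.
slot_for_sas_address() {
    awk -v addr="$1" '
        # Remember the most recent element header line.
        /Element type:/ {
            name = $0
            sub(/^[ \t]+/, "", name)       # drop leading indentation
            sub(/[ \t]*\[.*$/, "", name)   # drop the bracketed index and the rest
        }
        # Only the plain "SAS address:" line counts, not "attached SAS address:".
        # Appending "" forces a string comparison, so long hex values are
        # never compared as (lossy) floating point numbers.
        /SAS address:/ && !/attached/ && ($NF "") == (addr "") {
            print name
            exit
        }
    '
}
```

With the example values this would be run as: sg_ses -j /dev/sg49 | slot_for_sas_address 0x5000c50041f23b8a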
Now for the punchline: ask the enclosure nicely to please flash the LED so you know what this slot in the OS actually maps to in the real world. There are three associated operations: GET, SET, and CLEAR. As with the rest of this writeup, I’m using short arguments (-D, -G, -S and -C are short for --descriptor, --get, --set and --clear). In the examples below, NAME is the descriptor name (e.g., “Slot 6” found earlier) while ENCLOSURE is the enclosure device (e.g., “/dev/sg49”):
sg_ses -D "NAME" -G "ident" ENCLOSURE
sg_ses -D "NAME" -S "ident" ENCLOSURE
sg_ses -D "NAME" -C "ident" ENCLOSURE
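If the three flags are hard to remember, they can be hidden behind a tiny wrapper. This is just a convenience sketch; the function name locate_led and its on/off/status words are my own invention, and underneath it runs exactly the three sg_ses commands shown above.

```shell
# Hypothetical wrapper around the GET/SET/CLEAR commands above.
# Usage: locate_led on|off|status NAME ENCLOSURE
locate_led() {
    local action="$1" name="$2" enclosure="$3" flag
    case "$action" in
        on)     flag=-S ;;   # SET the ident (locate) bit
        off)    flag=-C ;;   # CLEAR the ident bit
        status) flag=-G ;;   # GET the current value of the ident bit
        *)
            echo "usage: locate_led on|off|status NAME ENCLOSURE" >&2
            return 2
            ;;
    esac
    sg_ses -D "$name" "$flag" ident "$enclosure"
}
```

For the example disk that would be: locate_led on "Slot 6" /dev/sg49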
For example, if you want to switch the locator LED on and off at 10 second intervals (to make it easier to tell the locator apart from coincidental disk activity), you can use a simple loop like this:
# CTRL-C when done
while true; do
    sg_ses -D "Slot 6" -S "ident" /dev/sg49
    sleep 10
    sg_ses -D "Slot 6" -C "ident" /dev/sg49
    sleep 10
done
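One wrinkle with an endless loop is that pressing CTRL-C during the first sleep leaves the LED lit. The variant below is a sketch that blinks a fixed number of times and tries to clear the LED when interrupted. The function name blink_slot and its INTERVAL and CYCLES arguments are assumptions for illustration, not anything sg_ses provides.

```shell
# Hypothetical helper: blink the locator LED a fixed number of times.
# Usage: blink_slot NAME ENCLOSURE [INTERVAL_SECONDS] [CYCLES]
# The body runs in a subshell, so the trap and variables do not leak
# into the calling shell.
blink_slot() (
    name=$1
    enclosure=$2
    interval=${3:-10}
    cycles=${4:-6}
    # Best effort: switch the LED off again if interrupted mid-cycle.
    trap 'sg_ses -D "$name" -C ident "$enclosure"; exit 130' INT TERM
    i=0
    while [ "$i" -lt "$cycles" ]; do
        sg_ses -D "$name" -S ident "$enclosure"
        sleep "$interval"
        sg_ses -D "$name" -C ident "$enclosure"
        sleep "$interval"
        i=$((i + 1))
    done
)
```

For example, blink_slot "Slot 6" /dev/sg49 10 30 would blink the slot for about ten minutes and leave the LED off afterwards.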