Tivoli Monitoring with Nagios-friendly TSMmonitor

Here is a cool little tool to babysit IBM’s Tivoli Storage Manager. Its features are pretty fly:

  • Supports multiple TSM servers (servername)
  • Can be used transparently as a Nagios plugin
  • Alert notification mechanism (by e-mail)
  • Supports new values for ok/warning/critical status on the command line
  • Bourne shell (sh) compliance
  • Easy to add new checks

Here are some more examples of its coolness:

    ———————————————————————
    show all checks help

    Usage..: tsmmonitor help
    Example: tsmmonitor help
    ———————————————————————
    check tsm database utilization

    The default percentages are:
    warning..: 85
    critical.: 90

    Usage..: tsmmonitor db [warning] [critical]
    Example: tsmmonitor db
    tsmmonitor db 80 95
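
The warning/critical pair lines up with the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL). Here is a minimal sketch of that comparison in plain sh, assuming a higher percentage is worse and that reaching a threshold is enough to trigger it. The function name `check_upper` is made up for illustration; this is not TSMmonitor's actual code.

```sh
#!/bin/sh
# Hypothetical helper, not taken from TSMmonitor.
# Compares an integer value against "higher is worse" thresholds,
# prints a status word and returns the matching Nagios exit code.
check_upper() {
    value=$1
    warning=$2
    critical=$3
    if [ "$value" -ge "$critical" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$value" -ge "$warning" ]; then
        echo "WARNING"
        return 1
    fi
    echo "OK"
    return 0
}
```

With the default db thresholds, `check_upper 87 85 90` prints WARNING and returns 1.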
    ———————————————————————
    check tsm recovery log utilization

    The default percentages are:
    warning..: 60
    critical.: 80

    Usage..: tsmmonitor log [warning] [critical]
    Example: tsmmonitor log
    tsmmonitor log 80 95
    ———————————————————————
    check scratch tapes minimum number

    The default numbers are:
    warning..: 10
    critical.: 6

    Usage..: tsmmonitor scratch [warning] [critical] [library_name]
    Example: tsmmonitor scratch
    tsmmonitor scratch 8 4
    tsmmonitor scratch 8 4 LTOLIB3
    tsmmonitor scratch LTOLIB3
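
Note that the warning default (10) is higher than the critical default (6) here, because fewer scratch tapes is the worse situation. A sketch of that inverted comparison in sh follows, assuming a count at or below a threshold triggers it. `check_lower` is an illustrative name, not something from the TSMmonitor source.

```sh
#!/bin/sh
# Hypothetical helper, not taken from TSMmonitor.
# "Lower is worse" thresholds: warning is the larger number,
# critical the smaller one.
check_lower() {
    value=$1
    warning=$2
    critical=$3
    if [ "$value" -le "$critical" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$value" -le "$warning" ]; then
        echo "WARNING"
        return 1
    fi
    echo "OK"
    return 0
}
```

With the defaults above, `check_lower 8 10 6` prints WARNING and returns 1, while 12 scratch tapes would be OK.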
    ———————————————————————
    check number of drives not online

    The default numbers are:
    warning..: 1
    critical.: 2

    Usage..: tsmmonitor drive [warning] [critical] [library_name]
    Example: tsmmonitor drive
    tsmmonitor drive 2 3
    tsmmonitor drive 1 2 LTOLIB3
    tsmmonitor drive LTOLIB3
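
The usage line lets you skip the thresholds and still pass a library name (`tsmmonitor drive LTOLIB3`). One way an sh script could tell the two forms apart is to test whether the first argument is a number. This is a guess at the approach, not TSMmonitor's source; `parse_args` and the variable names are invented for the example, and the defaults are the drive check's 1 and 2.

```sh
#!/bin/sh
# Hypothetical argument parser for: [warning] [critical] [library_name]
# Sets the variables warning, critical and library.
parse_args() {
    warning=1
    critical=2
    library=""
    case ${1:-} in
        ''|*[!0-9]*)
            # first argument is absent or not a number: keep the defaults
            ;;
        *)
            warning=$1
            critical=${2:-$critical}
            shift
            if [ $# -gt 0 ]; then shift; fi
            ;;
    esac
    # whatever is left (if anything) is taken as the library name
    library=${1:-}
}
```

After `parse_args LTOLIB3` the thresholds stay at 1 and 2 and `library` is LTOLIB3; after `parse_args 2 3` the thresholds are 2 and 3 and `library` is empty.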
    ———————————————————————
    check number of paths not online

    The default numbers are:
    warning..: 1
    critical.: 2

    Usage..: tsmmonitor path [warning] [critical]
    Example: tsmmonitor path
    tsmmonitor path 2 4
    ———————————————————————
    check tsm database fragmentation

    The default numbers are:
    warning..: 60
    critical.: 80

    Usage..: tsmmonitor dbfrag [warning] [critical]
    Example: tsmmonitor dbfrag
    tsmmonitor dbfrag 50 75
    ———————————————————————
    check number of unavailable volumes

    The default numbers are:
    warning..: 1
    critical.: 2

    Usage..: tsmmonitor unav [options] [warning] [critical] [device_class]
    -v, show unavailable volumes
    Example: tsmmonitor unav -v
    tsmmonitor unav 2 4
    tsmmonitor unav 2 4 LTOCLASS
    ———————————————————————
    check number of pending requests (query request)

    The default numbers are:
    warning..: 1
    critical.: 2

    Usage..: tsmmonitor req [warning] [critical]
    Example: tsmmonitor req
    tsmmonitor req 2 3
    ———————————————————————
    check a storage pool utilization

    The default numbers are:
    warning..: 80
    critical.: 95

    Usage..: tsmmonitor stgpool stgpool_name [warning] [critical]
    Example: tsmmonitor stgpool DISK_POOL
    tsmmonitor stgpool DISK_POOL 50 75
    ———————————————————————
    check for volumes with error (error_state)

    The default numbers are:
    warning..: 1
    critical.: 2

    Usage..: tsmmonitor volerr [options] [warning] [critical] [device_class]
    -v, show volumes with error
    Example: tsmmonitor volerr
    tsmmonitor volerr -v 3 5
    tsmmonitor volerr 3 5 LTOCLASS
    ———————————————————————
    check how many tapes are in the library

    The default numbers are:
    warning..: 90
    critical.: 86

    Usage..: tsmmonitor tapeslib [warning] [critical] [library_name]
    Example: tsmmonitor tapeslib
    tsmmonitor tapeslib 120 115
    tsmmonitor tapeslib 120 115 LTOLIB3
    tsmmonitor tapeslib LTOLIB3
    ———————————————————————
    check how many tapes have a specific owner

    The default numbers are:
    warning..: 2
    critical.: 3

    Usage..: tsmmonitor tapesown owner [warning] [critical]
    Example: tsmmonitor tapesown tsmsrv01
    tsmmonitor tapesown tsmsrv01 4 5
    ———————————————————————
    check how many volumes are in a specific storage pool

    The default numbers are:
    warning..: 40
    critical.: 50

    Usage..: tsmmonitor tapesstgpool stgpool_name [warning] [critical]
    Example: tsmmonitor tapesstgpool DAILY
    tsmmonitor tapesstgpool DAILY 30 45
    ———————————————————————
    check how many tsm db backups were made in the last 24 hours

    The default numbers are:
    warning..: 1
    critical.: 0

    Usage..: tsmmonitor dbbkp [options] [warning] [critical]
    -v, show some information about the database backup
    Example: tsmmonitor dbbkp
    tsmmonitor dbbkp -v
    tsmmonitor dbbkp 2 1
    ———————————————————————
    check number of node sessions

    The default numbers are:
    warning..: 15
    critical.: 20

    Usage..: tsmmonitor numsess [warning] [critical] [session_state]
    Example: tsmmonitor numsess
    tsmmonitor numsess 20 30
    tsmmonitor numsess 20 30 MediaW
    tsmmonitor numsess Run
    ———————————————————————
    check number of nodes

    The default numbers are:
    warning..: 80
    critical.: 90

    Usage..: tsmmonitor numnodes [warning] [critical] [domain]
    Example: tsmmonitor numnodes
    tsmmonitor numnodes 20 30
    tsmmonitor numnodes 20 30 SAP
    tsmmonitor numnodes SAP
    ———————————————————————
    check number of disk volumes without readwrite access

    The default numbers are:
    warning..: 1
    critical.: 2

    Usage..: tsmmonitor diskvol [options] [warning] [critical]
    -v, show volumes without readwrite access
    Example: tsmmonitor diskvol
    tsmmonitor diskvol -v
    tsmmonitor diskvol 2 3
    tsmmonitor diskvol -v 2 3
    ———————————————————————
    check number of database volumes not synchronized (copy status)

    The default numbers are:
    warning..: 1
    critical.: 2

    Usage..: tsmmonitor dbvol [warning] [critical]
    Example: tsmmonitor dbvol
    tsmmonitor dbvol 2 3
    ———————————————————————
    check number of log volumes not synchronized (copy status)

    The default numbers are:
    warning..: 1
    critical.: 2

    Usage..: tsmmonitor logvol [warning] [critical]
    Example: tsmmonitor logvol
    tsmmonitor logvol 2 3
    ———————————————————————
    check server license compliance

    Usage..: tsmmonitor lic
    Example: tsmmonitor lic
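
Since every check can run as a Nagios plugin, it is easy to bundle a few of them in a small wrapper that reports the worst result. The sketch below assumes tsmmonitor exits with the usual Nagios codes (0 OK, 1 WARNING, 2 CRITICAL), which the feature list implies but does not spell out. `worst_status` and the `TSMMONITOR` variable are hypothetical names for this example.

```sh
#!/bin/sh
# Path to the script; override it if tsmmonitor is not in PATH.
TSMMONITOR=${TSMMONITOR:-tsmmonitor}

# Hypothetical wrapper: runs each check given as an argument and
# returns the highest exit code seen (assumed to be the worst status).
worst_status() {
    worst=0
    for check in "$@"; do
        rc=0
        # $check is left unquoted on purpose so "db 80 95" splits into words
        $TSMMONITOR $check >/dev/null 2>&1 || rc=$?
        if [ "$rc" -gt "$worst" ]; then
            worst=$rc
        fi
    done
    return "$worst"
}
```

A call such as `worst_status db log "scratch 8 4"` would then return 2 if any of the three checks is critical. Any higher exit code (3 for UNKNOWN, or 127 if the script is not found) is treated as worse than critical.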

