...making Linux just a little more fun!

2-cent Tip: Wrapping a script in a timeout shell

Ben Okopnik [ben at linuxgazette.net]

Thu, 4 Jun 2009 08:22:35 -0500

----- Forwarded message from Allan Peda <tl082@yahoo.com> -----

From: Allan Peda <tl082@yahoo.com>
To: tag@lists.linuxgazette.net
Sent: Wednesday, May 20, 2009 11:34:27 AM
Subject: Two Cent tip

I have written previously on other topics for LG, and then IBM, but it's been a while, and I'd like to first share this without creating a full article (though I'd consider one).

This is a bit long for a two cent tip, but I wanted to share a solution I came up with for long-running processes that sometimes hang for an indefinite period of time. The solution I envisioned was to launch the process with a specified timeout period, so instead of running the problematic script directly, I would "wrap" it within a timeout shell function, which is not coincidentally called "timeout". This function could signal reluctant processes that their time is up, allowing the calling procedure to catch an OS error and respond appropriately.

Say the process that sometimes hung was called "long_data_load"; instead of running it directly from the command line (or a calling script), I would call it using the function defined below.

The unwrapped program might be:

long_data_load arg_one arg_two .... etc

which, for a timeout limit of 10 minutes, would then become:

timeout 10 long_data_load arg_one arg_two .... etc

So, in the example above, if the script failed to complete within ten minutes, it would instead be killed (using a hard SIGKILL), and an error would be returned. I have been using this on a production system for two months, and it has turned out to be very useful in re-attempting network-intensive procedures that sometimes seem never to complete. Source code follows:

# Allan Peda
# April 17, 2009
# function to call a long running script with a
# user set timeout period
# Script must have the executable bit set
# Note that "at" rounds down to the nearest minute
# best to use the full path
function timeout {
   # first argument must consist of digits only: the timeout in minutes
   if [[ ${1//[^[:digit:]]} != ${1} ]]; then
      echo "First argument of this function is timeout in minutes." >&2
      return 1
   fi
   declare -i timeout_minutes=${1:-1}
   # drop the timeout so that "$@" holds only the command and its arguments
   shift
   # sanity check, can this be run at all?
   if [ ! -x "$1" ]; then
      echo "Error: attempt to locate background executable failed." >&2
      return 2
   fi
   "$@" &
   declare -i bckrnd_pid=$!
   # queue the kill with at(1) and keep the job number it reports
   declare -i jobspec=$(echo kill -9 $bckrnd_pid |\
                        at now + $timeout_minutes minutes 2>&1 |\
                        perl -ne 's/\D+(\d+)\b.+/$1/ and print')
   # echo kill -9 $bckrnd_pid | at now + $timeout_minutes minutes
   # echo "will kill -9 $bckrnd_pid after $timeout_minutes minutes" >&2
   wait $bckrnd_pid
   declare -i rc=$?
   # cleanup unused batch job
   atrm $jobspec
   return $rc
}
# test case:
# ask child to sleep for 163 seconds
# putting process into the background, then reattaching
# but kill it after 2 minutes, unless it returns
# before then
# timeout 2 /bin/sleep 163
# echo "returned $? after $SECONDS seconds."
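
The re-attempting mentioned earlier can be sketched as a small retry loop around the function. This is only an illustration under stated assumptions, not code from the production system: the name retry_with_timeout and its argument order are hypothetical. It relies on "wait" reporting 128 plus the signal number for a child ended by a signal, so a SIGKILL (signal 9) shows up as status 137. A command that itself exits with 137 would look the same and also be retried.

```shell
# Hypothetical helper, not part of the original tip.
# Usage: retry_with_timeout <attempts> <minutes> <command> [args...]
# Assumes the timeout function above is already defined in this shell.
retry_with_timeout() {
   local attempts="$1" minutes="$2" try=1 rc
   shift 2
   while [ "$try" -le "$attempts" ]; do
      rc=0
      timeout "$minutes" "$@" || rc=$?
      case $rc in
         0)   return 0 ;;       # finished in time
         137) echo "attempt $try: killed after $minutes minute(s)" >&2 ;;
         *)   return "$rc" ;;   # usage error or the command's own failure
      esac
      try=$((try + 1))
   done
   # every attempt ran out of time
   return 137
}
```

For the earlier example, three attempts of ten minutes each might look like:

   retry_with_timeout 3 10 /full/path/to/long_data_load arg_one arg_two

On systems where GNU coreutils provides a "timeout" program, make sure the function is defined first; that program takes its duration in seconds by default, not minutes.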

----- End forwarded message -----

* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *
