Directory tree scripting using find and BASH

How to run a script once for each directory in an entire directory tree.

Sometimes you'll need to run a command or a script on every directory within a directory tree. (By directory tree I just mean every directory below a specified root directory, including all the subdirectories of those directories, and so on.) This page gives an example of how to run a BASH script on an entire directory tree. I'm using Ubuntu 8.04, but the same method should work in any distribution of Linux, and possibly in BSD and UNIX operating systems too.

The BASH script in this example tags FLAC audio files using the metaflac command (see my article about Replay Gain in Linux). However, it's the find command that does the work that's most useful to this article.

The find command

The find command is pretty handy. It can generate a list of files or directories in or below a specified directory, and it can also execute a command on each item in that list of results. (Note: if you're using a multi-user system, make sure to read the security considerations section of the Findutils documentation for the find command.)

The options to the find command that we need are -type d which tells find to return the names of every directory within the specified path, and -exec command '{}' \; which tells find to run command on the results. For example:

find /media/music/flac -type d -exec ~/tag-flac-with-rg.sh '{}' \;

This tells find to return every directory in or below the path /media/music/flac and then run the tag-flac-with-rg.sh script on each directory found. The '{}' is replaced by each directory name in turn, which is passed as an argument to the tag-flac-with-rg.sh script. The \; marks the end of the command given to -exec (the backslash stops the shell from treating the semicolon as its own command separator).
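Before pointing find at a real script, it can help to do a dry run with echo in its place. The following is a minimal sketch, not part of the original scripts; the temporary directory tree and its names (demo_root, "Artist One", and so on) are made up purely for illustration.

```shell
#!/bin/bash
# Build a small throwaway directory tree (hypothetical names).
demo_root=$(mktemp -d)
mkdir -p "$demo_root/Artist One/Album A" "$demo_root/Artist Two"

# Dry run: echo stands in for the per-directory script, so find
# just prints each directory it would have passed along.
visited=$(find "$demo_root" -type d -exec echo "Would process:" '{}' \;)
echo "$visited"

# Clean up the throwaway tree.
rm -rf "$demo_root"
```

The root directory itself is included in the results, so the per-directory script gets called for it as well as for everything below it.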

Wrapping the find command in a script

If you're going to use such a find command often, with different directory targets, you can wrap it into a script with a simple name, so that you needn't type out the whole thing every time.

Create a new text file (in our example for metaflac tagging, I've called the file tfwrg.sh) and enter the following into the file:

#!/bin/bash

if [ ! -d "$1" ]
then
	echo "Arg '$1' is NOT a directory!"
	exit 1
fi

find "$1" -type d -exec ~/tag-flac-with-rg.sh '{}' \;

The first line tells the system to use BASH to interpret the script. The next block checks that the argument you pass to this script is actually a directory path (and exits with an error value if not). The final line is the find command. Note that "$1" is replaced with the argument passed to this script.

Now you can just type ~/tfwrg.sh /media/music/flac to call that long find command, and the /media/music/flac path is passed to the find command (after being confirmed as a directory) as the root of a directory tree. Note that if the directory path contains spaces, you need to surround the entire path with double-quotes.
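To see how the directory check behaves, here's a small sketch that puts the same test into a function and tries it on a path containing a space and on a path that doesn't exist. The function name check_dir and the temporary paths are my own inventions for this demonstration; they aren't part of tfwrg.sh.

```shell
#!/bin/bash
# check_dir (hypothetical helper): the same -d test as in tfwrg.sh,
# but it returns a status instead of exiting, so it can be called
# several times in one run.
check_dir() {
	if [ ! -d "$1" ]
	then
		echo "Arg '$1' is NOT a directory!"
		return 1
	fi
	echo "'$1' is a directory."
	return 0
}

tmp=$(mktemp -d)
mkdir "$tmp/my music"

check_dir "$tmp/my music"        # quoted path with a space: accepted
check_dir "$tmp/no such place"   # missing path: rejected

rm -rf "$tmp"
```

Because the path is quoted both where the function is called and inside the test, the space in "my music" doesn't cause the path to be split in two.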

The per-directory script

Now that we know how to generate a list of directories in a directory tree, we need to write the script that gets called to deal with each directory in that list. In our example, the script is tag-flac-with-rg.sh, and it uses the metaflac command to tag FLAC files with Replay Gain data.

To see how to use the metaflac command, see my page about Replay Gain in Linux. Because metaflac won't do anything if a directory doesn't contain FLAC files, you could just call metaflac directly in the find command. But that wouldn't be very user-friendly, as metaflac generates no output at all when tagging files.

Instead, the tag-flac-with-rg.sh script uses echo to let the user know which directory it's currently working on, counts how many FLAC files it finds in that directory, and then calculates and outputs the values for each FLAC file in that directory before exiting. When there's a large number of directories to get through, this sort of feedback lets the user know that the script is actually doing something.

You can download both scripts below, but here's the bulk of the code in the tag-flac-with-rg.sh script:

# Check that the argument passed to this script is a directory.
# If it's not, then exit with an error code.
if [ ! -d "$1" ]
then
	echo "Arg '$1' is NOT a directory!"
	exit $ARGUMENT_NOT_DIRECTORY
fi

# Count the number of FLAC files in this directory
# (names ending in .flac).
flacnum=$(ls "$1" | grep -c '\.flac$')

# If no FLAC files are found in this directory,
# then exit without error.
if [ "$flacnum" -lt 1 ]
then
	echo "$1 (No FLAC files, moving on)"
	exit 0
else
	echo "$1 ($flacnum FLAC files)"
fi

# Run metaflac on the FLAC files in this directory.
echo "Calculating Replay Gain values for FLAC files."
metaflac --add-replay-gain "$1"/*.flac

# Output the newly-created Replay Gain values for the FLAC
# files in this directory.
echo "Newly-calculated Replay Gain values:"
flacfiles=$(ls -1 "$1"/*.flac)
IFS=$'\012' # separate file names correctly (use newlines)
for file in $flacfiles
do
	if [ ! -e "$file" ]
	then
		# This should not happen.
		echo "Error: file '$file' not found."
		exit $FILE_NOT_FOUND
	fi

	echo "$file"
	metaflac --show-tag=REPLAYGAIN_TRACK_GAIN "$file"
	metaflac --show-tag=REPLAYGAIN_ALBUM_GAIN "$file"
	echo ""
done

Note that this script also checks that the argument passed to it (again represented by "$1") is a directory, just in case someone calls this script directly instead of using find or tfwrg.sh. It also counts how many FLAC files are in the current directory (using the handy -c option in grep), and exits gracefully if none are present so that the find command can move on to the next directory.

The argument $1 is contained within double-quotes in all places in the script, just in case the directory path contains spaces (which would break most commands). This includes the calls to the echo command: echo would still print an unquoted path, but the shell would split it into separate words first, so runs of spaces could be collapsed and wildcard characters expanded.
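The effect of quoting is easy to see with a small experiment. This sketch uses a made-up helper function, count_args, and a made-up path; neither is part of the scripts on this page.

```shell
#!/bin/bash
# count_args (hypothetical helper): reports how many arguments it got.
count_args() {
	echo "$#"
}

dir="/media/music/flac/Some Artist"   # example path containing a space

unquoted=$(count_args $dir)    # the shell splits the path at the space
quoted=$(count_args "$dir")    # the path stays together as one argument

echo "Unquoted: $unquoted argument(s)"
echo "Quoted:   $quoted argument(s)"
```

With the default shell settings, the unquoted call sees two arguments and the quoted call sees one, which is why a command like metaflac would go looking for two files that don't exist.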

The last block of the script, which outputs the values for each FLAC file, uses the ls command to create a variable which contains all of the FLAC files in the current directory, separated by newline characters. The IFS (internal field separator) variable needs to be set to the value of $'\012' so that the for file in $flacfiles mechanism knows that the filenames in the variable are separated by newline characters. (012 is octal for character ten, which is the newline character.)
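As an alternative to parsing the output of ls, the shell's own filename expansion can feed the loop directly, which avoids changing IFS at all. This is a sketch of that different technique, not the code from the downloadable script; echo stands in for the metaflac calls, and the temporary directory and file names are invented for the demonstration.

```shell
#!/bin/bash
# Throwaway directory with hypothetical file names, one with spaces.
dir=$(mktemp -d)
touch "$dir/01 First Track.flac" "$dir/02-second.flac" "$dir/notes.txt"

found=0
for file in "$dir"/*.flac
do
	# If nothing matches, the pattern itself is left in $file,
	# so skip anything that isn't an existing file.
	[ -e "$file" ] || continue

	# The real script would call metaflac --show-tag=... here.
	echo "Would show tags for: $file"
	found=$((found + 1))
done
echo "$found FLAC file(s) found"

rm -rf "$dir"
```

Each matching file name arrives in the loop as a single word, however many spaces it contains, so nothing has to be split apart again afterwards.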

Download the tfwrg.sh and tag-flac-with-rg.sh scripts

Download the gzipped tarball:

Tarball: tfwrg.tar.gz

and then copy it into the directory where you want to keep the script files. (The examples above assume that both files will be in your home directory.) Then open a command prompt and change to the directory where the tarball is, and type the following two commands:

gunzip tfwrg.tar.gz
tar -xvvf tfwrg.tar

This should extract tfwrg.sh and tag-flac-with-rg.sh into that directory. You will probably need to change the owner and permission details (using chown and chmod) so that you can run the scripts.
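As a rough sketch of the permissions step, the following gives a script's owner permission to execute it and shows the permission string before and after. To keep the example self-contained it works on a throwaway script (hello.sh, a made-up name) in a temporary directory; for the real files you would apply the same chmod command to tfwrg.sh and tag-flac-with-rg.sh wherever you extracted them.

```shell
#!/bin/bash
# Create a throwaway script (hypothetical name) to practise on.
demo=$(mktemp -d)
printf '#!/bin/bash\necho "script ran"\n' > "$demo/hello.sh"

# Record the permission string, add execute (x) permission for
# the owner (u), then record the permission string again.
before=$(ls -l "$demo/hello.sh" | cut -c1-10)
chmod u+x "$demo/hello.sh"
after=$(ls -l "$demo/hello.sh" | cut -c1-10)

echo "Before: $before"
echo "After:  $after"

rm -rf "$demo"
```

The fourth character of the permission string changes from - to x once the owner is allowed to run the file.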