npath computes various measures of software complexity for C language source files. npath pipes each input C source file through the C Preprocessor, cpp(1), reads the preprocessor's output, parses it, and computes the complexity statistics. For example, the statistics for one of the source files in the TPOCC library look as follows:
                    NCSL  Volume  V(G)  NPATH  CLOC
                    ----  ------  ----  -----  ----
LIBALEX.C;1:
    date_and_time      3     186     1      1   0.3
    str_detab         21     991     9    120   5.7
    str_dupl          21     883     7     54   2.6
    str_etoa           5    5008     3      4   0.8
    str_free          12     395     5      6   0.5
    str_index          9     404     5     16   1.8
    str_insert        26    1158    11    864  33.2
    str_lcat           6     264     3      4   0.7
    str_lcopy         11     460     4      8   0.7
    str_lowcase        6     279     4      8   1.3
    str_upcase         6     279     4      8   1.3
    str_match          6     208     3      3   0.5
    str_trim           9     362     4     20   2.2
Summary - # of files: 1, # of modules: 13, # of NCSL: 141
Underneath the file name (a VMS file name, in this case) is a list of all the functions defined in the file. Each function name is followed by its complexity figures: NCSL, Halstead's volume, McCabe's V(G), NPATH, and CLOC. Each figure is described below.
After all the input source files have been processed, npath outputs a summary line totaling the number of files processed, the modules, and the lines of code.
Note that measurements are only made inside function bodies. Declarations outside the body of a function are not included in the counts.
"NCSL" is actually a misnomer for the first column of figures. npath only counts the number of statements (excluding declarations) in the body of the function. This is true for all the metrics measured by npath.
Halstead's Software Science metric is based on the number of operators and operands in a function. The length of a function is
total number of operators + total number of operands
The vocabulary of the function is
number of unique operators + number of unique operands
And, lastly, the volume of the function is defined as
length * log2 (vocabulary)
The sticky thing about the Software Science metric is deciding which things in a language are operators and which things are operands. npath uses the following conventions for the C language:
Operators
---------
break case continue default do else for
goto if return sizeof switch while
function call (Counts as one operator.)
{} () [] (Each pair counts as one operator.)
>>= <<= += -= *= /= %= &= ^= |= >> <<
++ -- -> && || <= >= == != ; , : = .
& ! ~ - + * / % < > ^ | ?
Operands
--------
Identifiers Numbers Characters ('x') Strings ("...")
Note that function calls get counted twice, both as an operator and as an operand (because of the identifier). It looked too difficult to do one and not the other. By the time the parser knows it's dealing with a function call, it's not completely clear what the function name is - remember, the stuff preceding the left parenthesis might be a complicated expression that produces a function pointer.
McCabe's cyclomatic complexity metric, V(G), is basically the number of conditional statements (plus one) in a function. npath counts the following statements when calculating V(G):
case default if
while do for
Renaud suggests steering clear of routines with a V(G) greater than 10.
The NPATH metric computes the number of possible execution paths through a
function. It takes into account the nesting of conditional statements and
multi-part boolean expressions (e.g., A && B, C || D, etc.). Nejmeh says
that his group had an informal NPATH limit of 200 on individual routines;
functions that exceeded this value were candidates for further
decomposition - or at least a closer look.
The CLOC metric is simply a function's NPATH number divided by the number of executable statements (NCSL) in the function; i.e., it measures the complexity per line of code in the function. Lower values of CLOC can mean one of two things: you write very clear code or your code doesn't do much of anything. Higher values of CLOC can also mean one of two things: your code has lots of important things to do or you program in a very obtuse manner!
... and various other articles I can't recall. Nejmeh's NPATH article provided the inspiration for this program; see the article for more information about how the NPATH metric measures up to the others. Salt's Software Science article provides details on measuring the complexity of Pascal programs using Halstead's metric. I haven't read any of the original literature by Halstead or McCabe; my understanding of their metrics is derived from the different articles on metrics that I've read (mostly in SIGPLAN and CACM).
TYPE_NAME tokens for type names and not IDENTIFIER tokens. npath
appears to handle these situations correctly. The "comp.compilers" USENET
group had a discussion about how hard it is to parse C without writing a
full-blown compiler - they were right!
comp.sources.unix") after writing
my program. This collection of tools measures delivered source instructions
(like my NCSL), McCabe's metric, and Halstead's software science metric.
McCabe's metric is measured by an awk(1) script, Halstead's by a
lex(1) program. While my C parser-based npath program might
seem more sophisticated than Renaud's tools, Renaud, unlike me, knows what
he's talking about when it comes to metrics! Using code from a large
project, Renaud studied the relationships between the different metrics
and the maintenance history of the code.
CC/PREPROCESS is done by a version of
popen(3) that I wrote for VMS.
-cflow" command line option generates a
cflow(1)-style, textual structure chart.
% npath [-D...] [-I...] [-U...] [-nostdinc]
[-cflow] [-cpp ] [-c++] [-debug]
[-echo] [-exclude file] [-full]
[-long] [-longer]
[-nocpp] [-noheading] [-npath_debug]
[-verbose] [-verify level] [-vperror] [-yacc_debug]
source_file(s)
where
-D... -I... -U... -nostdinc
    are C Preprocessor, cpp(1), options. These are passed on to cpp.
-cflow
    causes npath to construct a calling hierarchy from the source files and to write a cflow(1)-style structure chart to stdout.
-cpp
    specifies the pathname of the C Preprocessor. By default, this is /lib/cpp under UNIX and CC/PREPROCESS_ONLY=SYS$OUTPUT under VMS.
-c++
    causes npath to recognize certain C++ keywords ...
-debug
    enables debug output (written to stdout).
-echo
    causes the C source being scanned by the parser to be echoed to stdout. The C source is actually the preprocessed source output by the C Preprocessor.
-exclude
    specifies the name of a file containing a list of functions that should be ignored when generating a structure chart. Commonly-used library routines are candidates for this file.
-full
    causes npath to give the full path name for a function's source file when generating a cflow(1)-style structure chart. By default, npath only prints out the file name and extension.
-long
    specifies that long module names will be encountered. npath will shift the columns of numbers to the right by one tab stop so that the columns line up (more or less).
-longer
    is like the -long option, except that the columns of numbers are shifted two tab stops to the right.
-nocpp
    disables the preprocessing of source files. Normally, the source files are run through the C Preprocessor. With the -nocpp option, the source files are read directly.
-noheading
    inhibits the output of column headings.
-npath_debug
    enables npath-related debug output (written to stdout).
-verbose
    enables verbose mode. In this mode, npath displays (to stderr) the name of each file as the file is processed. This is useful when you have redirected the module/statistics output (stdout) to a file.
-vperror
    turns vperror() message output on. vperror() messages are low-level error messages generated by libgpl functions; normally, they are disabled. If enabled, the messages are output to stderr.
npath's source distribution contains no README file - yet! - and a single Makefile for SunOS 4.1.3.
npath.tar.Z