When developers discuss COBOL today, it’s usually with a mixture of dread and morbid curiosity. The language that powered banking systems and insurance companies for decades has become synonymous with technical debt and unmaintainable legacy code. But what if we’re looking at it wrong?
Like an archaeologist examining ancient pottery to understand civilization, we can learn valuable lessons by actually writing COBOL in 2025. Not as a historical exercise, but as a genuine exploration of what made this language successful enough to still process an estimated 95% of ATM transactions and 80% of in-person credit card purchases worldwide.
I built a COBOL matrix mathematics library with modern Python integration to understand what COBOL does well, what it does poorly, and what lessons from 1959 still matter in contemporary software development. The results were surprising.
The Repository: A Bridge Between Eras
The project is straightforward: seven COBOL programs implementing matrix operations (addition, subtraction, multiplication, division, transposition) compiled to shared libraries and called from Python via ctypes. Each COBOL program handles the numerical heavy lifting while Python manages the interface and orchestration.
This isn’t just a toy project for nostalgia. It demonstrates real interoperability between a 66-year-old language and modern tooling, revealing both the durability of COBOL’s design and the reasons it has persisted in production systems.
Lesson 1: Explicit is Better Than Implicit
Consider this COBOL data definition from the matrix addition program:
01 MATRIX-A.
   05 NUM-ROWS-A  PIC S9(5) SIGN LEADING SEPARATE.
   05 NUM-COLS-A  PIC S9(5) SIGN LEADING SEPARATE.
   05 ELEMENT-A   PIC S9(5)V99 SIGN LEADING SEPARATE
                  OCCURS 100 TIMES.
01 MATRIX-B.
   05 NUM-ROWS-B  PIC S9(5) SIGN LEADING SEPARATE.
   05 NUM-COLS-B  PIC S9(5) SIGN LEADING SEPARATE.
   05 ELEMENT-B   PIC S9(5)V99 SIGN LEADING SEPARATE
                  OCCURS 100 TIMES.
Every aspect of this data structure is explicitly specified:
- S9(5): Signed numeric with 5 digits
- V99: Implicit decimal point with 2 decimal places
- SIGN LEADING SEPARATE: Sign character stored separately at the front
- OCCURS 100 TIMES: Array of exactly 100 elements
There’s no ambiguity about representation, size, or layout. COBOL doesn’t let you declare an “int” and hope the compiler picks something reasonable. You specify exactly what you want, and that’s what you get.
This verbosity, often mocked, serves a critical purpose: auditability. When a financial calculation produces an unexpected result, you can trace through the COBOL code and know exactly how each variable is represented in memory. No surprises from compiler optimizations, no guessing about floating-point precision.
Modern parallel: Strong typing in TypeScript or Rust serves similar goals, but COBOL went further by making the physical representation explicit. Contemporary systems languages like Zig are rediscovering this principle.
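That explicitness pays off directly on the Python side of the bridge: the record can be mirrored byte for byte. Here is a minimal sketch, assuming the DISPLAY (text) representation GnuCOBOL uses for SIGN LEADING SEPARATE fields; the field names are illustrative, not the repository's code:

```python
import ctypes

# Each PIC S9(5) field is 6 bytes (sign + 5 digits); each PIC S9(5)V99
# element is 8 bytes (sign + 7 digits, decimal point implied).
class MatrixA(ctypes.Structure):
    _fields_ = [
        ("num_rows", ctypes.c_char * 6),          # PIC S9(5) SIGN LEADING SEPARATE
        ("num_cols", ctypes.c_char * 6),          # PIC S9(5) SIGN LEADING SEPARATE
        ("elements", ctypes.c_char * (8 * 100)),  # PIC S9(5)V99 OCCURS 100 TIMES
    ]

# The layout is fully determined by the picture clauses:
# 6 + 6 + 800 = 812 bytes, no padding, no guessing.
print(ctypes.sizeof(MatrixA))  # 812
```

Because every byte is accounted for in the COBOL source, the Python mirror can be checked against a single arithmetic fact rather than reverse-engineered from compiler behavior.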
Lesson 2: Fixed-Point Arithmetic Matters
One of COBOL’s most underappreciated features is its native fixed-point decimal arithmetic. The PIC S9(5)V99 declaration doesn’t use IEEE 754 floating-point - it uses scaled integer arithmetic with a fixed number of decimal places.
Why does this matter? Unless you’re already a devout IEEE 754 hater, consider this:
# Python floating-point (typical language)
balance = 100.10
interest_rate = 0.035
interest = balance * interest_rate
# Result: 3.5035000000000003
That trailing 0000000003 is a floating-point representation error. In production banking systems, these tiny errors compound over millions of transactions and billions of dollars.
COBOL avoids this entirely:
01 BALANCE       PIC S9(7)V99.
01 INTEREST-RATE PIC S9(1)V9999.
01 INTEREST      PIC S9(7)V99.

COMPUTE INTEREST = BALANCE * INTEREST-RATE.
The V99 specification means two decimal places, stored as an integer scaled by 100. 100.10 is stored as 10010, and arithmetic operations maintain exact precision. When you divide, COBOL rounds according to the picture clause specification, not according to floating-point rounding modes.
This is a massive reason why financial systems still use COBOL. Not because of inertia (though that plays a role), but because the language was designed from day one for precise decimal arithmetic on business data.
Modern languages have learned this lesson: Python’s decimal module, Java’s BigDecimal, and Rust’s rust_decimal crate all implement similar fixed-point arithmetic. But these are library add-ons, not native language features. In COBOL, you get it by default.
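For comparison, here is the same calculation using Python's decimal module; the picture clauses in the comments map to the COBOL above, and ROUND_HALF_UP stands in for COBOL's ROUNDED phrase:

```python
from decimal import Decimal, ROUND_HALF_UP

# COBOL-style fixed point in Python: build values from strings so they
# are exact, then quantize to the target "picture".
balance = Decimal("100.10")        # like PIC S9(7)V99
interest_rate = Decimal("0.035")   # like PIC S9(1)V9999
interest = (balance * interest_rate).quantize(
    Decimal("0.01"),               # two decimal places, like V99
    rounding=ROUND_HALF_UP)        # stands in for COBOL's ROUNDED
print(interest)  # 3.50
```

The product 3.50350 is held exactly and then rounded once, at a place you chose, rather than accumulating binary representation error at every step.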
Lesson 3: Structure Through Rigidity
COBOL programs follow an inflexible four-division structure:
IDENTIFICATION DIVISION.
PROGRAM-ID. MATADD.
ENVIRONMENT DIVISION.
DATA DIVISION.
WORKING-STORAGE SECTION.
LINKAGE SECTION.
PROCEDURE DIVISION USING ...
You must declare all data before writing any procedures. You cannot intermix declarations and logic. Variables are scoped by their division and section, not by block nesting.
This rigidity feels archaic until you’re debugging a 50,000-line COBOL program written in 1987. Every program follows the same structure. Data is always at the top. Procedures always at the bottom. No surprises.
Compare this to C or JavaScript, where variables can be declared anywhere, functions can be nested, and code organization is a matter of team convention. COBOL enforces organization through language design.
We see this elsewhere: Go’s strict formatting via gofmt, Rust’s module system requirements, and Python’s PEP 8 conventions all attempt to impose structure that COBOL mandated from the beginning. The difference is enforcement - COBOL won’t compile if you violate the structure; modern languages rely on linters and code review.
Lesson 4: Error Handling Without Exceptions
COBOL predates the concept of exceptions. Every program in this repository uses return codes:
PROCEDURE DIVISION USING MATRIX-A, MATRIX-B, RESULT-MATRIX,
                         RETURN-CODE.
    IF NUM-ROWS-A NOT = NUM-ROWS-B OR
       NUM-COLS-A NOT = NUM-COLS-B
        MOVE -1 TO RETURN-CODE
        GOBACK
    END-IF.
    PERFORM ADDITION-LOGIC.
    MOVE 0 TO RETURN-CODE.
    GOBACK.
The calling code must explicitly check RETURN-CODE to know if the operation succeeded. No stack unwinding, no exception propagation, no automatic cleanup.
This seems primitive, but it has benefits:
- Explicit error paths: You cannot ignore errors - they must be checked at each call site
- Predictable control flow: No hidden jumps from deep in the call stack
- Audit trail: Every error check is visible in the code
Modern systems programming has rediscovered these principles. Go uses explicit error returns. Rust uses Result<T, E> types that must be handled. Both reject exceptions in favor of explicit error propagation.
The COBOL approach forces defensive programming. Nothing fails loudly on your behalf: skip the RETURN-CODE check and you silently compute with garbage. The compiler won’t help you (there is no Result type system), but the calling convention makes every error path an explicit decision at the call site.
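The same discipline carries into the Python wrapper. A sketch with a pure-Python stand-in for the compiled routine (the real wrapper crosses into the .so via ctypes; the names here are illustrative):

```python
# Return-code discipline, COBOL-style: the callee reports success or
# failure through a code, and the caller must inspect it.
RC_OK, RC_DIM_MISMATCH = 0, -1

def cobol_matadd(a, b):
    """Stand-in for the compiled MATADD routine."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return RC_DIM_MISMATCH, None
    out = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    return RC_OK, out

def matrix_add(a, b):
    rc, result = cobol_matadd(a, b)
    if rc != RC_OK:      # the check is unavoidable: nothing will
        return None      # raise an exception on our behalf
    return result

print(matrix_add([[1, 2]], [[1], [2]]))  # None (dimension mismatch)
```

Go's `if err != nil` and Rust's `Result` matching are the modern descendants of exactly this call-site pattern.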
Lesson 5: The Cost of Interoperability
Integrating COBOL with Python revealed the work required to bridge languages designed in different eras. The Python wrapper needs to:
- Pack data into COBOL’s expected format:
def pack_number(num):
    sign_byte = b'+' if num >= 0 else b'-'
    abs_num = abs(num)
    integer_part = int(abs_num)
    decimal_part = int(round((abs_num - integer_part) * 100))
    combined = integer_part * 100 + decimal_part
    number_str = f"{combined:06d}".encode('ascii')
    return sign_byte + number_str
- Unpack COBOL’s output format:
def unpack_number(byte_data):
    sign_byte = byte_data[0:1]
    number_bytes = byte_data[1:7]
    scaled_int = int(number_bytes.decode('ascii'))
    number = scaled_int / 100.0
    if sign_byte == b'-':
        number = -number
    return number
- Flatten multi-dimensional arrays: COBOL doesn’t have multi-dimensional arrays in the modern sense - it has single-dimensional arrays that you index manually. Matrix data must be serialized to flat buffers.
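That flattening is plain row-major index arithmetic. A minimal sketch, using COBOL's 1-based subscripting:

```python
# Map a (row, col) pair onto COBOL's flat OCCURS table.
# Both the inputs and the result are 1-based, like COBOL subscripts.
def flat_index(row, col, num_cols):
    return (row - 1) * num_cols + col

print(flat_index(1, 1, 3), flat_index(2, 3, 3))  # 1 6
```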
This marshaling code is tedious and error-prone. A single byte offset error causes silent corruption. Different platforms (Linux vs macOS) have different shared library conventions. COBOL’s SIGN LEADING SEPARATE representation varies by compiler.
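One cheap guard against that kind of corruption is a round-trip property check over the marshaling pair (the two functions are repeated here so the sketch is self-contained):

```python
def pack_number(num):
    # Sign character, then the value scaled by 100 as 6 ASCII digits.
    sign_byte = b'+' if num >= 0 else b'-'
    abs_num = abs(num)
    integer_part = int(abs_num)
    decimal_part = int(round((abs_num - integer_part) * 100))
    combined = integer_part * 100 + decimal_part
    return sign_byte + f"{combined:06d}".encode('ascii')

def unpack_number(byte_data):
    # Inverse of pack_number: strip the sign, unscale by 100.
    scaled_int = int(byte_data[1:7].decode('ascii'))
    number = scaled_int / 100.0
    return -number if byte_data[0:1] == b'-' else number

# Round-trip check: any offset or scaling bug breaks these immediately.
for value in (0.0, 1.5, -3.25, 999.99, -0.01):
    assert unpack_number(pack_number(value)) == value
```

It won't catch a disagreement with the COBOL side's expectations, but it does pin down the Python pair against each other, which is where most one-byte-off mistakes first appear.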
The lesson: Interoperability has a cost. Modern languages often hide this cost with FFI libraries and automatic bindings, but someone still writes the bridge code. In mature COBOL systems, this bridge code has been battle-tested over decades. In new integrations, it’s a source of subtle bugs.
This is why replacing legacy COBOL systems is so difficult. It’s not just rewriting the business logic - it’s rebuilding all the interoperability glue that has accumulated over time.
Lesson 6: Compilation to Native Code Matters
Each COBOL program compiles to a shared library (.so file on Linux/macOS):
cobc -m -o simpleadd.so simpleadd.cob
cobc -m -o matadd.so matadd.cob
These aren’t interpreted or JIT-compiled - they’re native machine code that runs at full processor speed. The build script compiles seven programs in under a second on a modern machine.
Python loads these libraries at runtime:
lib = ctypes.CDLL('./matadd.so')
lib.MATADD.argtypes = [
    ctypes.POINTER(MatrixA),
    ctypes.POINTER(MatrixB),
    ctypes.POINTER(ResultMatrix),
    ctypes.POINTER(ReturnCode)
]
The COBOL code executes without Python interpreter overhead. For numerical operations, this is significantly faster than pure Python. The matrix multiplication implementation, with its triple-nested loops, benefits enormously from native execution.
This is similar to calling into NumPy (which wraps C/Fortran libraries) or using Rust libraries from Python via PyO3. The pattern of using a high-level language for orchestration and a compiled language for computation has been rediscovered repeatedly.
COBOL systems pioneered this approach in the 1960s with mainframe subroutines. The architecture remains valid today.
Lesson 7: Procedural Decomposition Before Functions
COBOL’s PERFORM statement creates named code sections without traditional function calls:
PROCEDURE DIVISION USING MATRIX-A, MATRIX-B, RESULT, RETURN-CODE.
MAIN-LOGIC.
    PERFORM VALIDATE-DIMENSIONS.
    IF RETURN-CODE NOT = 0
        GOBACK
    END-IF.
    PERFORM LOAD-MATRICES.
    PERFORM MULTIPLY-MATRICES.
    PERFORM STORE-RESULT.
    GOBACK.

VALIDATE-DIMENSIONS.
    IF NUM-COLS-A NOT = NUM-ROWS-B
        MOVE -1 TO RETURN-CODE
    END-IF.

MULTIPLY-MATRICES.
    PERFORM VARYING I FROM 1 BY 1 UNTIL I > NUM-ROWS-A
        PERFORM VARYING J FROM 1 BY 1 UNTIL J > NUM-COLS-B
            ...calculation...
        END-PERFORM
    END-PERFORM.
This isn’t structured like modern functions with local scopes and return values. All variables are global to the program. PERFORM transfers control to a named section, then returns when that section completes.
Yet this achieves procedural decomposition. Complex operations are broken into named, reusable pieces. The main logic reads almost like pseudocode: validate, load, multiply, store.
The limitation: no parameter passing, no local variables, no return values. Everything communicates through shared data structures. This makes testing individual procedures difficult and creates coupling between sections.
Modern structured programming added proper functions with parameters and returns. But COBOL’s PERFORM showed that decomposition was possible even without these features. In codebases where every function accesses shared state anyway (game engines with global entity lists, embedded systems with hardware registers), COBOL’s approach is more honest about the coupling.
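A loose Python analogue makes the trade-off visible: every "paragraph" becomes a method with no parameters and no return value, communicating only through shared attributes. The class and names below are illustrative, not the repository's code:

```python
# PERFORM-style decomposition translated literally: the "paragraphs"
# read and write shared state instead of passing values.
class MatMult:
    def run(self, a, b):
        self.a, self.b, self.rc = a, b, 0
        self.validate_dimensions()      # like PERFORM VALIDATE-DIMENSIONS
        if self.rc != 0:
            return None                 # like GOBACK on a bad RETURN-CODE
        self.multiply_matrices()        # like PERFORM MULTIPLY-MATRICES
        return self.result

    def validate_dimensions(self):
        if len(self.a[0]) != len(self.b):
            self.rc = -1

    def multiply_matrices(self):
        rows, cols, inner = len(self.a), len(self.b[0]), len(self.b)
        self.result = [[sum(self.a[i][k] * self.b[k][j]
                            for k in range(inner))
                        for j in range(cols)] for i in range(rows)]

print(MatMult().run([[1, 2]], [[3], [4]]))  # [[11]]
```

The coupling COBOL is accused of is still there; it is just spelled `self` instead of WORKING-STORAGE.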
Lesson 8: Tests Are Newer Than You Think
The test suite for this COBOL library is in Python, not COBOL:
def test_matrix_addition():
    result = matrix_add([[1.5, 2.5], [3.5, 4.5]],
                        [[0.5, 0.5], [0.5, 0.5]])
    expected = [[2.0, 3.0], [4.0, 5.0]]
    assert result == expected

def test_dimension_mismatch():
    result = matrix_add([[1, 2]], [[1], [2]])
    assert result is None  # Error case
COBOL has no built-in unit testing framework. No test runners, no assertion libraries, no mocking. The language predates the concept of automated testing by decades.
In production COBOL systems, testing typically happens through:
- Manual execution with test data
- Batch job runs compared against expected output files
- Integration tests of entire workflows
The lack of unit testing infrastructure isn’t a COBOL-specific problem - it’s generational. Languages designed before the 1990s (C, Fortran, Pascal) have bolt-on testing frameworks, not native support.
This archaeological insight matters: when you encounter legacy COBOL without tests, it’s not necessarily because the original developers were careless. The practice of writing unit tests alongside code simply didn’t exist when most COBOL systems were built.
Modern refactoring of legacy systems must account for this. You cannot just “add tests” to COBOL - you need to build test infrastructure around it, as this project does with Python.
Lesson 9: Build Systems Are Also Legacy
The build script for this project does something modern build systems have forgotten:
compile_program() {
    local program=$1
    echo -e "${BLUE}Compiling ${program}...${NC}"
    # Redirect to the log instead of piping through tee, so the `if`
    # tests cobc's exit status rather than the pipe's.
    if cobc -m -o "${program%.cob}.so" "$program" > /tmp/cobc_error.log 2>&1; then
        echo -e "${GREEN}✓ Successfully compiled ${program}${NC}"
        ((success_count++))
    else
        echo -e "${RED}✗ Failed to compile ${program}${NC}"
        cat /tmp/cobc_error.log
        ((fail_count++))
    fi
}
It provides immediate, colored feedback for each compilation. It reports overall success/failure. It’s 63 lines of straightforward Bash that anyone can understand and modify.
Modern build systems like Bazel, Buck, or Gradle are powerful but require significant expertise. Simple projects accumulate hundreds of lines of configuration. Build failures produce pages of diagnostic output.
The COBOL build script is refreshingly direct. Need to add a new program? Add one line:
compile_program "newprogram.cob"
This simplicity is possible because COBOL’s compilation model is simple: one source file compiles to one shared library, with no dependency resolution beyond system libraries. Modern languages with package managers and transitive dependencies cannot be this simple.
But the lesson remains: sometimes a shell script is the right tool. Not every project needs a complex build system. COBOL’s age means it predates build system complexity, forcing simpler solutions.
Lesson 10: Documentation in Code
COBOL enforces documentation through syntax:
IDENTIFICATION DIVISION.
PROGRAM-ID. MATMULT.
AUTHOR. COBOL-MATRIX-API.
DATE-WRITTEN. 2025-01-15.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SOURCE-COMPUTER. GNU-COBOL.
OBJECT-COMPUTER. GNU-COBOL.
These divisions are mandatory. Every program declares its identity, author, and date. While often filled with boilerplate, the structure enforces that metadata exists.
The verbose syntax itself acts as documentation:
PERFORM VARYING ROW-IDX FROM 1 BY 1 UNTIL ROW-IDX > NUM-ROWS-A
    PERFORM VARYING COL-IDX FROM 1 BY 1 UNTIL COL-IDX > NUM-COLS-B
        MOVE 0 TO RESULT-ELEMENT(ROW-IDX, COL-IDX)
        PERFORM VARYING K FROM 1 BY 1 UNTIL K > NUM-COLS-A
            COMPUTE RESULT-ELEMENT(ROW-IDX, COL-IDX) =
                RESULT-ELEMENT(ROW-IDX, COL-IDX) +
                (ELEMENT-A(ROW-IDX, K) * ELEMENT-B(K, COL-IDX))
        END-PERFORM
    END-PERFORM
END-PERFORM
You can read this aloud and understand it. The verbosity that makes COBOL tedious to write makes it easier to read. When maintaining code written decades ago by people who no longer work at the company, this readability has value.
Modern languages favor conciseness:
result = [[sum(a[i][k] * b[k][j] for k in range(cols_a))
           for j in range(cols_b)]
          for i in range(rows_a)]
This is more compact, but the COBOL version is more accessible to non-programmers. In financial systems where auditors must review code, COBOL’s verbosity is a feature, not a bug.
What We Lost
Not everything about COBOL deserves preservation. Some features are genuinely bad:
ALTER statement: Allows modifying GO TO targets at runtime, creating dynamic control flow impossible to analyze statically. Thankfully deprecated.
GO TO: COBOL’s unconditional GO TO (and the computed GO TO ... DEPENDING ON), combined with global data, makes spaghetti control flow easy to write in large programs.
Lack of local scope: All variables are global within a program. No function-local state, no encapsulation.
Column-sensitive formatting: Early COBOL reserved specific columns for specific purposes (columns 1-6 for sequence numbers, column 7 as an indicator column where an asterisk marks a comment, and columns 8-72 for code). GnuCOBOL relaxes this, but legacy code still depends on it.
Limited modularity: COBOL programs can call other programs, but the calling convention is heavyweight. Modern module systems enable much finer-grained code reuse.
These limitations aren’t charming quirks - they’re real obstacles to maintainability. They’re also products of their era: 1959 computing with punched cards and batch processing.
What We Can Recover
The valuable lessons from COBOL aren’t language-specific syntax - they’re design principles:
- Explicit representation over inference: Making data layout visible prevents entire classes of bugs
- Decimal arithmetic for business logic: Fixed-point math eliminates floating-point errors in financial calculations
- Enforced structure: Language-level organization makes large codebases navigable
- Explicit error handling: Return codes force acknowledgment of failure cases
- Native compilation: Critical paths benefit from machine-code execution
- Readable verbosity: Code is read more than written; optimize for comprehension
- Simple build systems: Not every project needs complex dependency resolution
Modern languages have rediscovered many of these principles. Rust’s explicit error handling and fine-grained control over data representation echo COBOL’s philosophy. Go’s enforced formatting and explicit error returns are similar. Domain-specific languages for financial modeling often choose fixed-point arithmetic.
The Real Lesson: Context Matters
COBOL seems archaic when judged by modern standards. But it wasn’t designed for modern contexts. It was designed for:
- Batch processing of business records
- Decimal-based financial calculations
- Code maintained by non-CS-degree business analysts
- Mainframe systems with expensive compilation
- Long-running processes (decades) with minimal changes
In this context, COBOL’s design makes sense. The verbosity aids comprehension for business users. The fixed-point arithmetic prevents financial errors. The rigid structure helps long-term maintenance. The compilation to native code maximizes mainframe efficiency.
Successful languages solve the problems of their era. COBOL succeeded because it matched its context perfectly. It persists because that context (financial calculation) hasn’t fundamentally changed.
When we build modern systems, we should ask: what is our context? What problems are we actually solving? Are we choosing tools that match our context, or tools that are fashionable?
Conclusion
Building a COBOL matrix library in 2025 revealed that COBOL isn’t the enemy of modern software development - premature abstraction, hidden complexity, and mismatched tools are the enemies. COBOL is explicit, rigid, and verbose by design. These qualities made it survive 66 years in production.
Not every lesson from COBOL applies to modern development. We have better tools for modularity, testing, and code reuse. But COBOL’s emphasis on explicit representation, precise arithmetic, and enforced structure deserves recognition.
The real software archaeology isn’t about preserving old code - it’s about understanding why that code made the choices it did, and recognizing when those choices still apply. When you need precise decimal arithmetic, COBOL-style fixed-point numbers are still the answer. When you need explicit error handling, return codes (or Result types) are still valid. When you need code readable by non-programmers, verbosity is a feature.
Legacy systems teach us that good design transcends language syntax. COBOL has lasted this long not because of syntax, but because it solved real problems with appropriate solutions. The languages we build today will be judged by the same standard: not whether they follow current fashion, but whether they solve real problems with appropriate solutions.
That’s the lesson from today’s little archaeology session: dig not to preserve the past, but to learn which principles endure.