This is a compact workflow reference for reversing small binaries: preserve C semantics in Python, drive local targets reproducibly, parse binary layouts explicitly, and inspect ELF shared-object boundaries.

For vectorized loops, packed math, and compiler intrinsics, use the related Intrinsics note.

C Semantics In Python

When porting C logic, preserve the behavior that Python does not model by default.

C concern             Python handling
Fixed-width overflow  Mask after operations with x & ((1 << bits) - 1).
Signedness            Convert at input/output boundaries, not only at the end.
Unsigned right shift  Mask first, then shift.
Integer division      Truncate toward zero as C99 does; // floors, and int(a / b) rounds through a float and can be wrong beyond 2**53.
Promotions            Re-apply width after operations where the C type matters.
switch fallthrough    Model explicitly; Python match does not fall through.
Pointers              Use offsets into bytearray, memoryview, ctypes, or mmap.
C strings             Slice bytes until \0; Python bytes do not stop there.
Struct layout         Use explicit struct formats or ctypes.Structure.
Unions                Reuse the same bytes with ctypes.Union or multiple unpack views.
Bitfields             Extract with masks and shifts; compiler packing can differ.
Float width           Python float is usually C double; use struct.pack("<f", ...) or numpy.float32 for 32-bit behavior.
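
As a small illustration of the float-width row, the sketch below rounds a Python float through a 32-bit encoding with struct. The helper name f32 is made up for this note, and the per-step rounding assumes the original C code evaluates in single precision rather than in a wider intermediate type.

```python
import struct

def f32(x: float) -> float:
    """Round x to the nearest IEEE-754 binary32 value, returned as a Python float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# 0.5 is exactly representable in 32 bits, 0.1 is not.
half_unchanged = f32(0.5) == 0.5
tenth_changed = f32(0.1) != 0.1

# C code that accumulates in `float` rounds after every operation,
# so the port re-applies f32 at each step instead of once at the end.
acc32 = 0.0
acc64 = 0.0
for _ in range(10):
    acc32 = f32(acc32 + f32(0.1))
    acc64 = acc64 + 0.1

print(half_unchanged, tenth_changed, acc32 == acc64)
```

Rounding only the final result is not equivalent: the 32-bit accumulator and the 64-bit one drift apart after a few additions.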

Use these helpers when the exact integer width matters:

from ctypes import c_int32, c_uint32

def to_i32(x: int) -> int:
    return c_int32(x).value

def to_u32(x: int) -> int:
    return c_uint32(x).value

def unsigned(x: int, bits: int = 32) -> int:
    return x & ((1 << bits) - 1)

def signed(x: int, bits: int = 32) -> int:
    x &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return x - (1 << bits) if x & sign else x

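def c_div(a: int, b: int) -> int:
    # Truncate toward zero like C99, using integers only so 64-bit values stay exact.
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q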

def c_mod(a: int, b: int) -> int:
    return a - c_div(a, b) * b

def shl(x: int, n: int, bits: int = 32) -> int:
    return unsigned(x << n, bits)

def shr_u(x: int, n: int, bits: int = 32) -> int:
    return unsigned(x, bits) >> n

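def shr_s(x: int, n: int, bits: int = 32) -> int:
    # Arithmetic shift. C leaves >> on negative values implementation-defined,
    # but mainstream compilers shift arithmetically, which this matches.
    return signed(x, bits) >> n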

def hx(x: int, bits: int = 32) -> str:
    return f"0x{unsigned(x, bits):0{bits // 4}X}"
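
To show where the mask goes in a real loop, here is a sketch that ports a small C hash written over uint32_t. FNV-1a is used only as a stand-in because its constants and test values are well known; it is not taken from any particular target.

```python
MASK32 = 0xFFFFFFFF

def fnv1a32(data: bytes) -> int:
    """Port of the C loop `h ^= byte; h *= 16777619u;` over uint32_t."""
    h = 0x811C9DC5                      # FNV-1a 32-bit offset basis
    for b in data:
        h ^= b
        h = (h * 0x01000193) & MASK32   # uint32_t multiply wraps; Python's does not
    return h

def fnv1a32_unmasked(data: bytes) -> int:
    """Same loop without the mask, to show how an unmasked port drifts."""
    h = 0x811C9DC5
    for b in data:
        h ^= b
        h = h * 0x01000193
    return h

print(hex(fnv1a32(b"a")), hex(fnv1a32_unmasked(b"a")))
```

For add, multiply, and xor, masking once at the end happens to give the same low bits. Right shifts, division, and comparisons all see the extra high bits, so masking after every step is the safer habit.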

Binary layout helpers:

import struct

def p32(x: int) -> bytes:
    return struct.pack("<I", unsigned(x, 32))

def u32(buf: bytes, off: int = 0) -> int:
    return struct.unpack_from("<I", buf, off)[0]

def cstr(buf: bytes, off: int = 0) -> bytes:
    end = buf.find(b"\0", off)
    return buf[off:] if end < 0 else buf[off:end]
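
The same ideas combine when parsing a record. The layout below (magic, version, flags, payload length, then a NUL-terminated name) is hypothetical and exists only to show explicit formats, a bounds check, and bitfield extraction together.

```python
import struct

# Hypothetical layout: 4-byte magic, u16 version, u16 flags, u32 payload length.
HEADER = struct.Struct("<4sHHI")

def parse_record(buf: bytes, off: int = 0) -> dict:
    if len(buf) - off < HEADER.size:
        raise ValueError(f"short buffer: header needs {HEADER.size} bytes")
    magic, version, flags, length = HEADER.unpack_from(buf, off)
    name_off = off + HEADER.size
    end = buf.find(b"\0", name_off)
    name = buf[name_off:] if end < 0 else buf[name_off:end]
    return {
        "magic": magic,
        "version": version,
        "compressed": bool(flags & 0x1),   # bit 0 (assumed meaning)
        "kind": (flags >> 4) & 0xF,        # bits 4..7 (assumed meaning)
        "length": length,
        "name": name,
    }

sample = b"DEMO" + struct.pack("<HHI", 2, 0x31, 16) + b"config\0tail"
rec = parse_record(sample)
print(rec)
```

The explicit "<" prefix disables native alignment and padding, so the format describes exactly the bytes on disk rather than what a local compiler would lay out.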

Before trusting a port, compare intermediate values against the original binary or a tiny C harness. Test 0, 1, -1, max signed, min signed, high-bit set values, short buffers, and null bytes.
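
When neither the binary nor a C harness is at hand, the fixed-width types in ctypes can serve as a rough reference for the integer helpers. This sketch repeats two of the helpers so it runs on its own, and the edge list is only a starting point.

```python
from ctypes import c_int32, c_uint32

def unsigned(x: int, bits: int = 32) -> int:
    return x & ((1 << bits) - 1)

def signed(x: int, bits: int = 32) -> int:
    x &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return x - (1 << bits) if x & sign else x

# 0, 1, -1, max signed, min signed, high-bit-set values, and just past the width.
EDGE = [0, 1, -1, 2**31 - 1, -2**31, 2**31, 2**32 - 1, 2**32, -2**32 - 1]

mismatches = [
    x for x in EDGE
    if signed(x) != c_int32(x).value or unsigned(x) != c_uint32(x).value
]
print("mismatches:", mismatches)
```

An empty list only says the helpers agree with ctypes on these inputs; it does not replace comparing intermediate values against the original binary.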

Python Process Harness

A harness should make the run reproducible: exact path, exact input, exact environment, captured stdout/stderr, return code, and timeout.

Minimal runner:

from pathlib import Path
import subprocess

target = Path("./program").resolve()

def run(payload: bytes = b"", args=(), timeout: float = 5, env=None):
    proc = subprocess.run(
        [str(target), *args],
        input=payload,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
        check=False,
        env=env,
    )
    return proc.returncode, proc.stdout, proc.stderr

code, out, err = run(b"A" * 24 + b"\n")
print(out.decode(errors="replace"))
print(err.decode(errors="replace"))
print("returncode:", code)
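
The runner above lets subprocess.TimeoutExpired propagate and returns raw return codes. A possible extension, sketched here with hypothetical names (describe, run_checked) and the Python interpreter as a stand-in target, turns both into short labels. On POSIX a negative return code -N means the child was killed by signal N.

```python
import signal
import subprocess
import sys

def describe(returncode: int) -> str:
    """Label a return code; negative values are signal numbers on POSIX."""
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exit {returncode}"

def run_checked(cmd, payload: bytes = b"", timeout: float = 5.0):
    """Run cmd once and return (status, stdout, stderr)."""
    try:
        proc = subprocess.run(
            cmd, input=payload, stdout=subprocess.PIPE,
            stderr=subprocess.PIPE, timeout=timeout, check=False,
        )
    except subprocess.TimeoutExpired as exc:
        # run() kills the child before re-raising; captured output may be None.
        return "timeout", exc.stdout or b"", exc.stderr or b""
    return describe(proc.returncode), proc.stdout, proc.stderr

# Stand-in targets so the sketch runs without ./program.
status, out, _ = run_checked([sys.executable, "-c", "print('ok')"])
slow, _, _ = run_checked(
    [sys.executable, "-c", "import time; time.sleep(10)"], timeout=0.5
)
print(status, out, slow)
```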

Make environment changes local to the harness:

import os

env = os.environ.copy()
env["LC_ALL"] = "C"
env["PYTHONHASHSEED"] = "0"

code, out, err = run(b"test\n", env=env)
assert code in (0, 1)
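
One way to keep the reproducibility fields listed at the start of this section is to save them with each run. The record shape and the run_record name below are assumptions for illustration, and the Python interpreter stands in for the target.

```python
import hashlib
import json
import os
import subprocess
import sys
from pathlib import Path

def run_record(target: Path, payload: bytes, env: dict,
               args=(), timeout: float = 5.0) -> dict:
    """Run the target once and return a JSON-serializable record of the run."""
    proc = subprocess.run(
        [str(target), *args], input=payload, stdout=subprocess.PIPE,
        stderr=subprocess.PIPE, env=env, timeout=timeout, check=False,
    )
    return {
        "target": str(target),
        "target_sha256": hashlib.sha256(target.read_bytes()).hexdigest(),
        "args": list(args),
        "payload_hex": payload.hex(),     # hex keeps null bytes intact in JSON
        "env": env,
        "timeout": timeout,
        "returncode": proc.returncode,
        "stdout_hex": proc.stdout.hex(),
        "stderr_hex": proc.stderr.hex(),
    }

env = os.environ.copy()
env["LC_ALL"] = "C"
record = run_record(Path(sys.executable), b"", env, args=("-c", "print('hi')"))
saved = json.dumps(record, sort_keys=True)
print(len(saved), record["returncode"])
```

Hashing the target ties the record to one exact build, which matters once patched copies of the binary start to accumulate. The full environment can contain secrets, so trim it before sharing a saved record.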

Use subprocess.Popen only when the target needs staged interaction. Note that communicate() still sends all input in one shot and waits for the process to exit:

proc = subprocess.Popen(
    [str(target)],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

out, err = proc.communicate(b"input\n", timeout=5)
print(proc.returncode, out, err)
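
For input that really is staged, where a later line depends on an earlier reply, a sketch along these lines writes and reads one step at a time. The child here is a stand-in written for this note (it echoes each line in upper case), and send_line is a hypothetical helper.

```python
import subprocess
import sys

# Stand-in target (an assumption for this sketch): upper-cases each input line.
CHILD = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)

def send_line(proc, data: bytes) -> bytes:
    """Write one stage of input, flush it, and block for one line of reply."""
    proc.stdin.write(data)
    proc.stdin.flush()
    return proc.stdout.readline()

proc = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)
first = send_line(proc, b"hello\n")
second = send_line(proc, b"stage two\n")   # could be computed from `first`
proc.stdin.close()                           # EOF lets the child exit
proc.stdout.close()
returncode = proc.wait(timeout=5)
print(first, second, returncode)
```

readline() blocks forever if the target buffers its output or never prints a newline, so real targets may need a read with a deadline or a pseudo-terminal instead.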

For local preload experiments, keep the hook path explicit:

from pathlib import Path
import os

root = Path("./scratch").resolve()
target = root / "program"
hook = root / "tap.so"

env = os.environ.copy()
env["LD_PRELOAD"] = str(hook)

Then pass env into the runner. If the hook changes behavior, the harness should make the change visible in stdout, stderr, return code, or a saved trace.
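
A missing preload object is easy to overlook: the dynamic loader reports that it cannot preload the file, ignores it, and runs the target unhooked. The sketch below checks the hook before building the environment. preload_env is a hypothetical helper, and the demo call uses the Python interpreter path purely as a file that is known to exist.

```python
import os
import sys
from pathlib import Path

def preload_env(hook: Path, base=None) -> dict:
    """Return a copy of the environment with hook placed first in LD_PRELOAD."""
    hook = hook.resolve()
    if not hook.is_file():
        raise FileNotFoundError(f"preload hook not found: {hook}")
    env = dict(os.environ if base is None else base)
    existing = env.get("LD_PRELOAD", "")
    # Entries in LD_PRELOAD may be separated by colons or spaces.
    env["LD_PRELOAD"] = f"{hook}:{existing}" if existing else str(hook)
    return env

# Stand-in path so the sketch runs anywhere; a real run would pass tap.so.
demo = preload_env(Path(sys.executable), base={"LD_PRELOAD": "/opt/other.so"})
print(demo["LD_PRELOAD"])
```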

ELF And Shared Objects

Changing an ELF type field does not make an arbitrary executable behave like a shared library. Prefer one of these workflows:

  • Inspect the binary to understand type, imports, exports, relocations, and entry points.
  • Patch metadata on an actual shared object.
  • Build a wrapper from source or a relocatable object.
  • Load a real shared object from Python with an explicit ABI.

Inspect first:

file ./target
readelf -h -l -d ./target
readelf -Ws ./target | less
objdump -d -M intel ./target | less
objdump -T ./libtarget.so | less
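
The type that readelf -h reports comes from the e_type field at offset 16 of the ELF header. A minimal reader, assuming only the standard header layout and tested here against a synthetic header rather than a real file, looks like this:

```python
import struct

ET_NAMES = {0: "ET_NONE", 1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}

def elf_summary(header: bytes) -> dict:
    """Decode class, byte order, e_type, and e_machine from the first 20 bytes."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    ei_class, ei_data = header[4], header[5]
    endian = "<" if ei_data == 1 else ">"     # 1 = little-endian, 2 = big-endian
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {
        "bits": {1: 32, 2: 64}.get(ei_class),
        "little_endian": ei_data == 1,
        "type": ET_NAMES.get(e_type, hex(e_type)),
        "machine": e_machine,
    }

# Synthetic 64-bit little-endian header with e_type = ET_DYN, e_machine = 62 (x86-64).
fake = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, 62)
info = elf_summary(fake)
print(info)

# For a real file: elf_summary(open("./target", "rb").read(64))
```

Position-independent executables also report ET_DYN, so this field alone does not separate a library from an executable. The program headers and dynamic section from readelf -l -d are still needed for that.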

Patch metadata only when the input is already a suitable shared object:

patchelf --print-soname ./libtarget.so
patchelf --set-soname libtarget.so ./libtarget.so
patchelf --print-rpath ./libtarget.so
patchelf --set-rpath '$ORIGIN' ./libtarget.so

When a relocatable object is available and was compiled as position-independent code (-fPIC), build a wrapper:

/* wrapper.c */
extern int main(int, char **);

int target_main(int argc, char **argv) {
    return main(argc, argv);
}

gcc -fPIC -shared -o wrapper.so wrapper.c target.o

Load it from Python:

import ctypes

lib = ctypes.CDLL("./wrapper.so")
lib.target_main.argtypes = [ctypes.c_int, ctypes.POINTER(ctypes.c_char_p)]
lib.target_main.restype = ctypes.c_int

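Call into the library only after declaring argtypes and restype. Without them ctypes passes Python integers as C int and reads the result as int, so 64-bit pointers and sizes can be silently truncated and the experiment produces misleading output.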
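
The argv parameter of target_main needs a NULL-terminated array of C strings. A sketch of building one follows; make_argv is a hypothetical helper, and the final call is left as a comment because it needs the wrapper library.

```python
import ctypes

def make_argv(args):
    """Build (argc, argv) where argv is a NULL-terminated char *[] for ctypes."""
    encoded = [a if isinstance(a, bytes) else a.encode() for a in args]
    # One extra slot holds the terminating NULL that C expects at argv[argc].
    argv = (ctypes.c_char_p * (len(encoded) + 1))(*encoded, None)
    return len(encoded), argv

argc, argv = make_argv(["program", "--flag", "value"])
print(argc, argv[0], argv[argc])

# With wrapper.so loaded and argtypes/restype declared:
# rc = lib.target_main(argc, argv)
```

ctypes accepts the array where POINTER(c_char_p) is declared. Keep in mind that the target runs inside the Python process, so a call to exit() in its main ends the interpreter as well.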
