Skip to main content

Markdown Mate for Jupyter Notebooks: automating in-notebook documentation with LLMs

·2 mins

This project is a small automation that enriches an existing Jupyter Notebook by inserting explanatory Markdown cells immediately before every code cell. It preserves the original code unchanged and generates concise human-readable descriptions that explain each code block’s purpose, inputs, outputs, and side effects.

Auto Markdown diagram

Objectives #

  • Local inference: Use a locally downloaded open source LLM for security and privacy.
  • Faster comprehension: Researchers and reviewers can scan a notebook and understand intent without reading every line of code.
  • Improved reproducibility: Inline documentation clarifies side effects like file writes and plotting, making experiments easier to reproduce.
  • Onboarding & teaching: Instructors and new team members get ready-made narrative flow for notebooks.

đź”§ Automation #

  • Loads a notebook JSON (auto_markdown.ipynb).
  • Removes execution artifacts (outputs and execution_count) to avoid leaking runtime state.
  • Constructs a structured prompt for a locally-hosted LLM, requesting a version of the notebook JSON where each code cell is preceded by a short Markdown cell.
  • Calls the model and writes a new notebook file (e.g., auto_markdown_with_md.ipynb) containing the added markdown cells while leaving code unchanged.

Tech stack #

  • Language: Python (Jupyter notebook automation).
  • Notebook format: Standard .ipynb (JSON).
  • LLM client: OpenAI-compatible client pointed at a local Ollama instance (http://localhost:11434/v1).
  • Model: llama3.3 (used locally via Ollama in the example).

Benefits for data research and academia #

  • Paper reproducibility: Reviewers and reproducibility officers appreciate notebooks that tell a clear story; automated explanations reduce friction when sharing experimental artifacts.
  • Faster review cycles: Small teams can supply notebooks with embedded explanations that speed up peer review and code audits.
  • Pedagogy: Instructors can programmatically generate annotated notebooks for exercises and examples.
  • Accessible documentation: The automation produces consistent, plain-language descriptions that help non-experts understand computational workflows.

🛡️ Limitations & best practices #

  • The quality of generated explanations depends on the model—validate domain-specific claims (especially for scientific computations).
  • Use the output as a draft: human review is recommended for correctness and contextual clarity.

đź“‚ Source code #

Project notebook: static/source_code/auto_markdown.ipynb (in this blog repo: https://github.com/ZombieDadCoding/debugging-the-matrix/).