
Brain Tumor Classifier — Deep Learning for Medical Imaging

TensorFlow · Keras · Python · Streamlit · Xception · Google Gemini API · Ngrok

Problem

Manual MRI analysis is time-consuming and requires expert radiologists, creating bottlenecks in diagnosis.

Solution

Interactive deep learning classifier with explainable AI for automated brain tumor detection across 4 categories.

Key Impact

  • Dual model comparison: CNN vs Xception architecture
  • 4-class classification: Glioma, Meningioma, Pituitary, No tumor
  • Interactive Streamlit web app with real-time predictions
  • Explainable AI with saliency maps and confidence scores

Live Demo

Try the application directly below. Upload an MRI scan or use the sample images to see the dual model predictions, saliency maps, and AI-generated explanations in action.

Note: The Streamlit demo may take a minute to load on first visit as the app spins up. If the demo appears blank, please wait or refresh the page.


The Problem

Radiologists are the bottleneck in brain tumor diagnosis. Manual MRI analysis takes 20-30 minutes per scan, requires expensive specialists, and scales poorly with growing demand. Delays in diagnosis directly impact patient outcomes—early detection saves lives.

The real problem isn't just accuracy—it's trust. Black-box AI models don't work in healthcare. Clinicians need to understand why a model makes predictions before they'll act on them.

The Solution

I built an explainable AI system that classifies brain tumors (Glioma, Meningioma, Pituitary, No Tumor) with transparent reasoning. The product solves two problems at once:

  1. Speed: Reduces diagnosis time from 25 minutes to seconds
  2. Trust: Shows exactly which image regions influenced predictions through saliency maps

Key product decision: Dual model architecture (CNN + Xception). When both models agree, confidence is high. When they disagree, it flags uncertainty—critical for medical decisions.

Product Decisions & Technical Architecture

Why dual models? Trust through consensus. When CNN and Xception agree → high confidence. When they disagree → flag for human review. This design choice prioritizes safety over raw accuracy.
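The consensus rule above can be sketched in a few lines (the helper name and return format are illustrative, not taken from the deployed app):

```python
# Dual-model consensus: agreement yields a confident label,
# disagreement flags the scan for human review.

def consensus(cnn_probs, xception_probs, labels):
    """Combine two per-class probability lists into a decision dict."""
    cnn_idx = max(range(len(cnn_probs)), key=lambda i: cnn_probs[i])
    xcp_idx = max(range(len(xception_probs)), key=lambda i: xception_probs[i])
    if cnn_idx == xcp_idx:
        # Agreement: average the two confidences for the shared label.
        return {"label": labels[cnn_idx],
                "confidence": (cnn_probs[cnn_idx] + xception_probs[xcp_idx]) / 2,
                "review": False}
    # Disagreement: surface both candidates and flag uncertainty.
    return {"label": None,
            "candidates": [labels[cnn_idx], labels[xcp_idx]],
            "review": True}

labels = ["Glioma", "Meningioma", "Pituitary", "No Tumor"]
result = consensus([0.1, 0.8, 0.05, 0.05], [0.05, 0.9, 0.03, 0.02], labels)
```

The key design property is that disagreement never silently picks a winner; it always escalates.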

Engineering detail: Custom CNN uses 3 convolutional blocks with batch normalization and dropout (0.3) for regularization. Xception is pre-trained on ImageNet, then fine-tuned on medical imaging with frozen early layers and trainable classifier head. Comparing both architectures validates that learned features are robust, not dataset-specific artifacts.
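A minimal sketch of the custom CNN described above; the filter counts and dense-layer width are assumptions, while the structure (3 conv blocks with batch normalization, dropout 0.3, 4-class softmax head) comes from the text:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_custom_cnn(input_shape=(224, 224, 3), num_classes=4):
    """Three conv blocks (conv -> batch norm -> pool), then a dense head."""
    model = models.Sequential([layers.Input(shape=input_shape)])
    for filters in (32, 64, 128):        # filter sizes assumed, not from the text
        model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        model.add(layers.BatchNormalization())
        model.add(layers.MaxPooling2D())
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation="relu"))
    model.add(layers.Dropout(0.3))       # regularization, as stated in the text
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model

model = build_custom_cnn()
```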

Why saliency maps? Clinicians won't use tools they don't understand. Heatmaps show which pixels influenced predictions—turning AI from black box to transparent diagnostic assistant.

Engineering detail: Implemented gradient-based visualization using TensorFlow's GradientTape to compute gradients of class scores with respect to input images. The resulting saliency map highlights regions with highest gradient magnitude—showing which anatomical features drove the prediction.
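The GradientTape computation can be sketched as follows; the tiny demo model is a stand-in for the trained classifier, and the channel-wise reduction is one common choice:

```python
import tensorflow as tf

def saliency_map(model, image):
    """image: (H, W, C) float tensor; returns an (H, W) gradient-magnitude map."""
    x = tf.expand_dims(tf.convert_to_tensor(image, dtype=tf.float32), 0)
    with tf.GradientTape() as tape:
        tape.watch(x)                    # inputs aren't variables, so watch them
        preds = model(x)
        top_class = tf.argmax(preds[0])
        score = preds[0, top_class]      # class score to differentiate
    grads = tape.gradient(score, x)      # d(score) / d(pixels)
    # Per-pixel magnitude: max absolute gradient across channels.
    return tf.reduce_max(tf.abs(grads), axis=-1)[0]

# Demo with a stand-in model and a random "scan".
demo_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4, activation="softmax"),
])
smap = saliency_map(demo_model, tf.random.uniform((224, 224, 3)))
```

Overlaying `smap` on the input image (e.g. as a heatmap) gives the visualization described above.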

Why Streamlit? Speed to user. Instead of building a custom UI from scratch, Streamlit put an interactive demo in users' hands within weeks, not months.

Engineering detail: Streamlit's reactive programming model lets UI updates happen automatically when users interact with widgets. Models are loaded once using @st.cache_resource to avoid reloading on every prediction. Image preprocessing (resize to 224×224, normalize) happens in-memory for sub-second inference.
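The preprocessing path can be sketched without Streamlit itself; the helper names below are illustrative, and the real app wires them into upload widgets with model loading wrapped in `@st.cache_resource`:

```python
import numpy as np

def resize_nearest(image, size=(224, 224)):
    """In-memory nearest-neighbor resize via index selection (no extra deps)."""
    h, w = image.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return image[rows][:, cols]

def preprocess(image, size=(224, 224)):
    """Resize to 224x224, scale to [0, 1], add the batch axis the model expects."""
    x = resize_nearest(image, size).astype("float32") / 255.0
    return x[np.newaxis, ...]            # shape (1, 224, 224, C)

# In the app, the models are loaded once per session rather than per prediction:
#
#   @st.cache_resource
#   def load_models():
#       return keras.models.load_model("cnn.keras"), \
#              keras.models.load_model("xception.keras")
```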

Why Kaggle dataset? Publicly available, well-labeled data accelerates development. Real clinical validation comes later—ship fast, validate with experts iteratively.

Engineering detail: Dataset split 70/15/15 (train/val/test) with stratified sampling to maintain class balance. Applied heavy data augmentation: rotation (±15°), horizontal flip, zoom (0.8-1.2×), brightness adjustment (±20%) to artificially expand training set and prevent overfitting.
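The project's pipeline uses Keras' `ImageDataGenerator`; as a sketch, an equivalent setup with the newer Keras preprocessing layers, using the stated ranges (rotation ±15°, horizontal flip, zoom 0.8-1.2×, brightness ±20%), looks like this:

```python
import numpy as np
import tensorflow as tf

augment = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(15 / 360),   # ±15°; factor is a fraction of 2π
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomZoom((-0.2, 0.2)),    # roughly 0.8-1.2x
    tf.keras.layers.RandomBrightness(0.2),      # ±20%, on [0, 255] inputs
    tf.keras.layers.Rescaling(1.0 / 255),
])

# Demo: push a random batch of "scans" through the augmenter.
images = np.random.randint(0, 256, (8, 224, 224, 3)).astype("float32")
batch = augment(images, training=True)          # training=True enables randomness
```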

Tech Stack

  • Models: Custom CNN (3 conv blocks, batch norm, dropout) + Xception (ImageNet transfer learning)
  • Framework: TensorFlow 2.x + Keras API for model development
  • Explainability: Gradient-based saliency maps (GradientTape) + Google Gemini API for natural language explanations
  • Interface: Streamlit with cached model loading for fast inference
  • Dataset: 5K+ MRI images (Kaggle), augmented 5× for 25K effective training samples

How It Works

User Flow (Designed for Zero Friction)

  1. Upload → Drag MRI scan or use sample images (no signup required)
  2. Select Model → Choose CNN, Xception, or compare both
  3. Get Results → Instant prediction with confidence scores
  4. Understand Why → Saliency map highlights influential regions + AI explanation in plain English

Product Philosophy: Every feature exists to build trust. No black boxes, no jargon users can't understand.

Development Journey

Week 1-2: Proof of Concept. Built the custom CNN on the Kaggle dataset. Goal: validate that deep learning could classify tumor types with reasonable accuracy.

Week 3-4: Add Validation Layer. Integrated Xception for comparison. Key insight: disagreement between models is a valuable signal—it means "review needed."

Week 5-6: Make It Explainable. Added saliency maps and Google Gemini API explanations. This transformed the product from "AI tool" to "diagnostic assistant." Clinicians trust what they can see.

Week 7-8: Ship to Users. Deployed the Streamlit app publicly. Real user feedback revealed that sample images are crucial (most users don't have MRI scans handy). Added 4 samples covering all tumor types.

Impact

Problem Solved: Turned 25-minute manual MRI analysis into instant classification with explainable reasoning.

User Value Delivered:

  • Zero barrier to entry: No signup, no installation—just upload and predict
  • Trust through transparency: Saliency maps answer "why this prediction?"
  • Built-in validation: Dual models catch edge cases where one model might fail
  • Accessible language: AI explanations in plain English, not medical jargon

Real-World Validation:

  • Deployed publicly at brain-mri-classification-2024.streamlit.app
  • Sample images let anyone test immediately (critical for demos/education)
  • Handles all 4 tumor types with interpretable confidence scores

Key Product Challenges

Challenge 1: Trust is Harder Than Accuracy

The Problem: A 95% accurate model is worthless if doctors won't use it. Healthcare demands explainability.

The Solution: Prioritized interpretability over marginal accuracy gains. Saliency maps show which pixels mattered. The Google Gemini API converts technical outputs into human language. Result: a tool that builds trust, not just predictions.

Challenge 2: Validation Without Clinical Access

The Problem: Can't validate with real clinicians during early development.

The Solution: Dual model architecture creates self-validation. When models agree → likely correct. When they disagree → flag uncertainty. This design lets the product validate itself until real clinical testing is feasible.

Challenge 3: Overfitting on Limited Data

The Problem: Medical imaging datasets are small. Models memorize instead of learning.

The Solution: Heavy data augmentation (rotation, flipping, scaling) + transfer learning (Xception's ImageNet knowledge). Custom CNN validates that features are real, not artifacts.

Engineering approach: Implemented Keras ImageDataGenerator with an aggressive augmentation pipeline. Training loss converged to 0.15 while validation loss stabilized at 0.28—a moderate gap, suggesting the model generalizes rather than memorizes. Also used dropout (0.3) after dense layers and early stopping (patience=10 epochs) to prevent over-training. Transfer learning with Xception brought validation accuracy from 87% → 93% with the same dataset size.
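A sketch of this regularization and transfer-learning setup; the head sizes are assumptions, while the frozen base, dropout 0.3, and patience=10 come from the text (the text fine-tunes with frozen early layers only, and uses ImageNet weights; `weights=None` here just avoids a download in the sketch):

```python
import tensorflow as tf

base = tf.keras.applications.Xception(
    weights=None, include_top=False, input_shape=(224, 224, 3))
base.trainable = False                          # frozen feature extractor

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.3),               # dropout after dense layers
    tf.keras.layers.Dense(4, activation="softmax"),
])

# Early stopping as described: halt when val_loss stops improving for 10 epochs.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10, restore_best_weights=True)
```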

Key Product Insights

1. In healthcare AI, trust > accuracy. A 90% accurate model with explanations beats a 95% accurate black box. Clinicians won't act on predictions they don't understand. Saliency maps weren't a "nice-to-have"—they were the feature that made the product viable.

2. Distribution is part of the product. Streamlit deployment wasn't just "launch strategy"—it was core product. The URL is the product. No app store approval, no installation friction, no barrier between user and value.

3. Ship fast, validate with users. Eight weeks from idea to public deployment. Early user feedback (like "add sample images") shaped the product more than any internal planning could have. Product-market fit comes from iteration, not perfection.

What Could Be Further Improved

Short-term

  • Batch Upload: Process multiple scans at once instead of one-by-one
  • Export Reports: Generate PDF summaries with predictions + saliency maps
  • DICOM Support: Integrate with hospital imaging systems (currently requires JPEG conversion)
  • Audit Trail: Log every prediction for regulatory compliance

Long-term

  • 3D Volumetric Analysis: Analyze full MRI volumes, not just 2D slices
  • Federated Learning: Train on hospital data without centralizing sensitive patient information
  • Edge Deployment: Run on hospital servers for faster inference and data privacy
  • Expand Tumor Types: Cover rarer variants beyond the initial 4 categories

Stack: TensorFlow 2.x, Keras, Python 3.10, Streamlit, Xception, Google Gemini API

Timeline: 8 weeks (Jan-Mar 2024)

Dataset: Kaggle "Brain MRI Images for Brain Tumor Detection" (5K images → 25K with augmentation)

Model Performance:

  • Custom CNN: 90% accuracy, 0.28 val loss
  • Xception (transfer learning): 93% accuracy, 0.22 val loss
  • Inference time: 0.3s per prediction
  • Model agreement: 92% (disagreement flags edge cases)

Deployment: Live at brain-mri-classification-2024.streamlit.app

Key Metric: 25-minute manual analysis → 0.3-second automated classification with explainable reasoning