ChurnSense — AI Customer Churn Predictor

A mobile-first progressive web app that predicts customer churn risk using a machine learning pipeline trained on simulated behavioural data.

Live model: Logistic Regression · AUC 0.79 · 9 features · 3,000 training samples


Features


Tech Stack

| Layer | Technology |
| --- | --- |
| Backend | Python 3.12, Flask 3.1.3 |
| ML | scikit-learn 1.6.1 (Pipeline, ColumnTransformer, LogisticRegression) |
| Data | pandas 2.2.3, numpy 2.2.3 |
| Frontend | Vanilla HTML/CSS/JS (no framework) |
| Deployment | Vercel (@vercel/python) |
| Production server | Gunicorn 23.0.0 |

Input Features

| Feature | Type | Range / Options |
| --- | --- | --- |
| Gender | Categorical | Female, Male |
| Country | Categorical | Canada, Egypt, Germany, India, UK, USA |
| SubscriptionType | Categorical | Basic, Free, Premium |
| ContractType | Categorical | Monthly, Annual |
| TenureMonths | Numeric | 1 – 72 months |
| MonthlyCharge | Numeric | $0 – $100 |
| SupportTickets | Numeric | 0 – 8 |
| DaysSinceLastLogin | Numeric | 0 – 365 days |
| PaymentFailures | Numeric | 0 – 5 |
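
The ranges and options listed above can be checked before a payload reaches the model. The following is a minimal sketch, not the project's actual validation code: `validate_features`, `CATEGORICAL`, and `NUMERIC` are hypothetical names, and treating the ranges as inclusive bounds is an assumption.

```python
# Hypothetical validator: allowed values mirror the Input Features list.
CATEGORICAL = {
    "Gender": {"Female", "Male"},
    "Country": {"Canada", "Egypt", "Germany", "India", "UK", "USA"},
    "SubscriptionType": {"Basic", "Free", "Premium"},
    "ContractType": {"Monthly", "Annual"},
}
# Assumption: both ends of each numeric range are inclusive.
NUMERIC = {
    "TenureMonths": (1, 72),
    "MonthlyCharge": (0, 100),
    "SupportTickets": (0, 8),
    "DaysSinceLastLogin": (0, 365),
    "PaymentFailures": (0, 5),
}


def validate_features(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload is valid."""
    errors = []
    for name, allowed in CATEGORICAL.items():
        value = payload.get(name)
        if name not in payload:
            errors.append(f"{name}: missing")
        elif not isinstance(value, str) or value not in allowed:
            errors.append(f"{name}: expected one of {sorted(allowed)}")
    for name, (low, high) in NUMERIC.items():
        value = payload.get(name)
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{name}: expected a number")
        elif not low <= value <= high:
            errors.append(f"{name}: expected a value between {low} and {high}")
    return errors
```

A caller could reject the request whenever the returned list is non-empty and report the messages back to the user.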

Project Structure

churnsense/
├── app.py                        # Flask backend — /predict, /health, /
├── generate_data.py              # Dataset generation + model training script
├── model.pkl                     # Trained sklearn Pipeline (serialised)
├── model_meta.json               # Feature metadata (categories, ranges, model name)
├── requirements.txt              # Python dependencies
├── vercel.json                   # Vercel deployment config
├── .gitignore
├── templates/
│   └── index.html                # Single-file PWA frontend
├── static/
│   ├── manifest.json             # PWA manifest
│   ├── sw.js                     # Service worker
│   ├── icon-192.png
│   └── icon-512.png
└── Simulated Churn Data/
    └── Simulated Customer Data.csv   # 3,000-row training dataset

API

GET /health

Returns model status.

{ "status": "ok", "model": "Logistic Regression" }

POST /predict

Accepts a JSON body with all 9 features.

Request

{
  "Gender": "Male",
  "Country": "India",
  "SubscriptionType": "Premium",
  "ContractType": "Monthly",
  "TenureMonths": 6,
  "MonthlyCharge": 75,
  "SupportTickets": 3,
  "DaysSinceLastLogin": 45,
  "PaymentFailures": 2
}

Response

{ "churn": 1, "probability": 97.9 }

churn is the predicted class: 1 (likely to churn) or 0 (likely to stay). probability is the predicted churn likelihood as a percentage (0–100).
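
For calling the endpoint from a script, a small client along these lines may be convenient. It is a sketch using only the standard library. The base URL assumes the local dev server described under Running Locally, and the helper names (`build_predict_request`, `predict`, `describe`) are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: local dev server


def build_predict_request(features: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /predict request with a JSON body."""
    return urllib.request.Request(
        f"{base_url}/predict",
        data=json.dumps(features).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features: dict, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON response."""
    request = build_predict_request(features, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def describe(result: dict) -> str:
    """Turn a response such as {"churn": 1, "probability": 97.9} into text."""
    label = "likely to churn" if result["churn"] == 1 else "likely to stay"
    return f"{label} (churn probability {result['probability']:.1f}%)"
```

With the server running, `describe(predict(payload))` would summarise the prediction for a payload shaped like the request shown earlier.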


Running Locally

# 1. Clone
git clone https://github.com/RamyKhairy24/churnsense.git
cd churnsense

# 2. Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. (Optional) Regenerate the dataset and retrain the model
python generate_data.py

# 5. Start the dev server
python app.py

Open http://localhost:5000 in your browser.

Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| SECRET_KEY | random (auto-generated) | Flask session secret |
| PORT | 5000 | HTTP port |
| FLASK_DEBUG | false | Enable debug mode (true / false) |
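
One way these variables and their defaults could be resolved at startup is sketched below. This is an illustration only: `load_config` is a hypothetical helper, and app.py may read the environment differently.

```python
import os
import secrets


def load_config(env=os.environ) -> dict:
    """Resolve settings, falling back to the documented defaults."""
    return {
        # Random fallback: sessions will not survive a restart
        # unless SECRET_KEY is set explicitly.
        "SECRET_KEY": env.get("SECRET_KEY") or secrets.token_hex(32),
        "PORT": int(env.get("PORT", "5000")),
        # Assumption: only the string "true" (any case) enables debug mode.
        "DEBUG": env.get("FLASK_DEBUG", "false").strip().lower() == "true",
    }
```

Because the fallback secret changes on every start, setting SECRET_KEY explicitly is the safer choice for any deployment that relies on sessions.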

Deployment on Vercel

The repository includes a vercel.json that routes all traffic through app.py using the @vercel/python builder. No extra configuration is needed — just connect the repo in the Vercel dashboard and deploy.

Set the SECRET_KEY environment variable in the Vercel project settings for production.


Model Training

generate_data.py builds the full ML pipeline:

  1. Generates 3,000 synthetic customer records with a logistic churn model
  2. Compares three classifiers via 5-fold stratified CV: Logistic Regression, Random Forest, Gradient Boosting
  3. Retrains the best model on the full dataset
  4. Saves model.pkl and model_meta.json

| Model | CV AUC | Test AUC |
| --- | --- | --- |
| Logistic Regression | 0.7730 ± 0.017 | 0.7905 |
| Gradient Boosting | 0.7437 ± 0.018 | 0.7588 |
| Random Forest | 0.7417 ± 0.019 | 0.7426 |
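
The cross-validation step can be sketched with the scikit-learn components named in the Tech Stack. This is an assumption-laden illustration, not the contents of generate_data.py: the preprocessing choices (one-hot encoding, standard scaling), the `max_iter` value, the seed, and the helper names `make_pipeline` and `cv_auc` are all hypothetical, and only the Logistic Regression candidate is shown.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

CATEGORICAL = ["Gender", "Country", "SubscriptionType", "ContractType"]
NUMERIC = ["TenureMonths", "MonthlyCharge", "SupportTickets",
           "DaysSinceLastLogin", "PaymentFailures"]


def make_pipeline() -> Pipeline:
    """Preprocessing plus classifier in a single estimator."""
    preprocess = ColumnTransformer([
        # Assumption: one-hot encode categoricals, scale numerics.
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
        ("num", StandardScaler(), NUMERIC),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


def cv_auc(X: pd.DataFrame, y: pd.Series, seed: int = 42) -> tuple[float, float]:
    """Mean and standard deviation of ROC AUC over 5 stratified folds."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    scores = cross_val_score(make_pipeline(), X, y, cv=cv, scoring="roc_auc")
    return float(np.mean(scores)), float(np.std(scores))
```

Swapping the `clf` step for a random forest or gradient boosting classifier would allow the same function to score the other two candidates before the best one is refit on the full dataset.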

License

MIT