CAR-T · Endometriosis · In Silico Discovery

Bloomsbury Burger
Therapeutics

We are building a dual-target CAR-T therapy guided by single-cell genomics. This is not hormone modulation. Not symptom management. This is targeted cellular ablation of ectopic lesions, informed by a 54-patient transcriptomic atlas and a differentiable optimisation formulation we believe is novel.

Atlas loaded
54/54 samples processed
Optimisation: converged
Pair stability: 0.92

Why endometriosis, why now

Endometriosis affects ~10% of people with a uterus worldwide. The current standard of care is surgical resection (cutting out the lesions), which is imprecise and has a high recurrence rate. Surgeons cannot reliably distinguish all pathological tissue from healthy tissue by visual inspection alone.

The field is chronically underfunded. A 1977 FDA guideline recommended excluding women of childbearing potential from early-phase clinical trials, and women's health research has been playing catch-up ever since. Femtech is currently 80% cycle trackers and 20% rebranded thermometers.

We think a targeted cellular therapy, informed by single-cell genomics, could be a genuinely better solution. CAR-T has already transformed blood cancer treatment. We're asking: what if we applied the same logic to solid tissue pathology?

10%
Prevalence
Global prevalence among people with a uterus. Comparable to diabetes. Receives a fraction of the research funding.
50%
Recurrence rate
Post-surgical recurrence within 5 years. Visual resection misses microscopic pathological deposits.
0
Targeted therapies
Zero molecularly-targeted therapies currently approved. The field remains stuck in hormone suppression paradigms.

Technical pipeline

01

Load & QC single-cell data

Load raw Cell Ranger output (raw_feature_bc_matrix.h5) for each sample. Filter out empty droplets and dying cells using gene count and mitochondrial read thresholds. Save filtered cells as .h5ad files.
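
A minimal sketch of the filtering logic in plain Python, run on a toy count table rather than a real Cell Ranger file. The thresholds (`min_genes`, `max_mito_frac`), the `MT-` prefix convention for mitochondrial genes, and the helper names `qc_metrics` and `passes_qc` are illustrative assumptions, not the values or code used in this pipeline.

```python
# Toy QC filter: keep cells with enough detected genes and a low
# mitochondrial read fraction. Thresholds are illustrative assumptions.


def qc_metrics(counts):
    """counts: dict mapping gene symbol -> raw UMI count for one cell."""
    total = sum(counts.values())
    n_genes = sum(1 for c in counts.values() if c > 0)
    # Assumption: mitochondrial genes are named with an "MT-" prefix.
    mito = sum(c for g, c in counts.items() if g.startswith("MT-"))
    mito_frac = mito / total if total > 0 else 0.0
    return n_genes, mito_frac


def passes_qc(counts, min_genes=3, max_mito_frac=0.2):
    """True if the cell has enough genes and few mitochondrial reads."""
    n_genes, mito_frac = qc_metrics(counts)
    return n_genes >= min_genes and mito_frac <= max_mito_frac


# Hypothetical barcodes and counts, for illustration only.
cells = {
    "AAAC": {"EPCAM": 5, "KRT8": 3, "VIM": 2, "MT-CO1": 1},    # intact cell
    "AAAG": {"EPCAM": 1, "MT-CO1": 9, "MT-ND1": 8, "VIM": 1},  # mostly mito reads
    "AAAT": {"VIM": 1},                                        # near-empty droplet
}

kept = [barcode for barcode, counts in cells.items() if passes_qc(counts)]
print(kept)
```

On real data the same thresholds would more likely be applied with a single-cell toolkit such as Scanpy (for example `sc.read_10x_h5`, `sc.pp.calculate_qc_metrics` and `sc.pp.filter_cells`); that tooling choice is an assumption, since it is not named above.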

02

Denoise with scVI-VAE

Single-cell data is noisy: genes drop out randomly. We run scVI (a variational autoencoder) to learn a clean latent representation of each cell's true expression profile. This enables stable binarisation for downstream combinatorial optimisation.

03

Differentiable marker optimisation

Instead of brute-force enumeration of ~4.5 million possible marker pairs, we relax the discrete combinatorial search into continuous space using a Gumbel-Softmax reparameterisation, optimise with gradient descent, then snap back to a hard discrete selection. We believe this specific formulation, which combines a biomedical specificity objective with a whole-body safety penalty in a jointly differentiable system, is novel. → See full mathematics
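
To make the relax-then-snap idea concrete, here is a small sketch of Gumbel-Softmax sampling in plain Python. It covers only the sampling and hard-selection steps. The gradient update itself would normally be done in an autodiff framework and is not shown. The logits, the temperature values and the function names `gumbel_softmax` and `snap` are hypothetical, chosen for illustration.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Draw one relaxed (soft) one-hot sample over the candidates."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1).
        # U is clamped away from 0 and 1 to keep the logs finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / temperature)
    # Numerically stable softmax over the perturbed, scaled logits.
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    z = sum(exps)
    return [e / z for e in exps]


def snap(soft):
    """Hard selection: index of the largest soft weight."""
    return max(range(len(soft)), key=soft.__getitem__)


rng = random.Random(0)
logits = [0.1, 2.0, 0.3, -1.0]  # hypothetical learned scores for 4 genes

# Lower temperatures push the soft sample towards a one-hot vector.
for temperature in (5.0, 1.0, 0.1):
    soft = gumbel_softmax(logits, temperature, rng)
    print(temperature, [round(w, 3) for w in soft], "-> gene", snap(soft))
```

A pair would need two such selections (or one selection over pairs); which of those the formulation uses is not specified here, so the sketch stays with a single categorical choice.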

04

Cross-reference Tabula Sapiens

Any surviving marker pairs are checked against Tabula Sapiens, a whole-body human cell atlas. Expression in heart, lung, brain or other critical organs incurs an infinite penalty. Only pairs that are truly lesion-specific survive this filter.
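
A sketch of how such a hard safety filter could look, on toy data. It assumes an AND-gate reading, in which a pair is penalised only when both markers are on in the same cell of a critical tissue. A stricter per-marker rule would penalise either marker alone. The tissue list, `tolerance` parameter, gene names and function names are all hypothetical.

```python
import math

# Assumption: an illustrative subset of "critical" tissues.
CRITICAL_TISSUES = {"heart", "lung", "brain"}


def pair_fraction(cells, gene_a, gene_b):
    """Fraction of cells in which BOTH markers are on (AND gate)."""
    if not cells:
        return 0.0
    hits = sum(1 for cell in cells if gene_a in cell and gene_b in cell)
    return hits / len(cells)


def safety_penalty(atlas, gene_a, gene_b, tolerance=0.0):
    """Infinite penalty if the pair co-occurs in any critical tissue."""
    for tissue in CRITICAL_TISSUES:
        if pair_fraction(atlas.get(tissue, []), gene_a, gene_b) > tolerance:
            return math.inf
    return 0.0


# Each cell is the set of genes called "on" after binarisation.
# Toy data with placeholder gene names.
atlas = {
    "heart": [{"GENE_A"}, {"GENE_A", "GENE_C"}],
    "lung": [{"GENE_B"}, {"GENE_B", "GENE_C"}],
    "brain": [{"GENE_C"}],
}

print(safety_penalty(atlas, "GENE_A", "GENE_B"))  # never co-expressed
print(safety_penalty(atlas, "GENE_A", "GENE_C"))  # co-expressed in a heart cell
```

Subtracting this penalty from a specificity score sends any unsafe pair to negative infinity, so it can never outrank a safe one.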

05

Output top candidate pairs

The optimisation outputs a ranked top-5 list of dual marker combinations. These are handed off for benchtop feasibility assessment: checking whether scFvs (the targeting domains) exist or can be designed for each candidate.
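
The ranking step itself is simple once each pair has a score. The sketch below assumes a single scalar score per pair (higher is better) and that pairs failing the safety check carry a score of negative infinity. The scores, gene names and the `rank_pairs` helper are made up for illustration.

```python
import heapq
import math


def rank_pairs(scores, k=5):
    """Return the k highest-scoring pairs, dropping non-finite scores."""
    finite = [(pair, s) for pair, s in scores.items() if math.isfinite(s)]
    return heapq.nlargest(k, finite, key=lambda item: item[1])


# Hypothetical scores; -inf marks a pair that failed the safety filter.
scores = {
    ("G1", "G2"): 0.91,
    ("G1", "G3"): 0.40,
    ("G2", "G3"): -math.inf,
    ("G2", "G4"): 0.77,
    ("G3", "G4"): 0.65,
    ("G1", "G4"): 0.12,
    ("G4", "G5"): 0.83,
}

for rank, (pair, score) in enumerate(rank_pairs(scores), start=1):
    print(rank, pair, score)
```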

Dataset: GSE213216

We're using the Human Endometriosis Cell Atlas published in Nature Genetics (2024). It contains single-cell RNA sequencing data from 54 patient samples across multiple tissue types: ectopic endometrial lesions (endometriomas + peritoneal lesions), eutopic endometrium, and unaffected control tissue.

Total size: ~15.7 GB. Format: Cell Ranger output (10x Genomics). Each sample loads as a barcodes × genes matrix. Integration across samples is performed after QC and batch-aware modelling with scVI.

54 samples · ~15.7 GB · 10x Genomics · scRNA-seq · Cell Ranger output · Ectopic lesions · Eutopic endometrium · Control tissue · Nature Genetics 2024

Key resources

Glossary: for the new people

What is a CAR-T cell?
CAR-T stands for Chimeric Antigen Receptor T-cell. It's a type of immune cell (T-cell) that we genetically engineer to express a synthetic receptor (the CAR) on its surface. This receptor recognises specific proteins on target cells and triggers the T-cell to destroy them. It's already used in blood cancers and has transformed treatment for some patients.
What is scRNA-seq?
Single-cell RNA sequencing. It lets us measure which genes are switched on or off in thousands of individual cells simultaneously. Traditional (bulk) RNA-seq gives you an average across millions of cells; scRNA-seq gives you a snapshot of every single cell independently. This is how we can identify what makes ectopic endometrial cells different from healthy ones.
What's ectopic vs eutopic endometrium?
Eutopic endometrium is the normal uterine lining, where it's supposed to be. Ectopic means "out of place": in endometriosis, endometrial-like tissue grows outside the uterus (on the ovaries, peritoneum, fallopian tubes, etc). These lesions are what we want to target with the CAR-T. Our targets need to be present in ectopic tissue but NOT in eutopic tissue or healthy organs.
What is scVI?
scVI is a deep learning model (variational autoencoder) designed specifically for single-cell data. Single-cell measurements are very noisy: genes randomly drop out and counts are sparse. scVI learns a cleaner latent representation of each cell's true biology by modelling the technical noise explicitly. We use it to denoise the data before doing marker discovery.
What is Tabula Sapiens?
A whole-human cell atlas: essentially a map of gene expression across every major cell type in the human body. We use it as a safety filter. If a candidate marker is also expressed in heart cells, lung cells, or neurons, we throw it out immediately, because we don't want the CAR-T attacking those.
Why dual targeting?
A CAR-T that targets one marker is risky: if that marker appears anywhere else in the body, you get off-target toxicity. A dual-target CAR only activates when BOTH markers are present simultaneously (like an AND gate). This is far more specific, because the probability of two markers co-occurring in healthy tissue is much lower than either one alone. J&J recently showed this works in humans.
What's Gumbel-Softmax and why does it matter?
The marker selection problem is discrete: you either include a gene or you don't. Gradient descent requires a continuous, differentiable objective. The Gumbel-Softmax trick lets us relax discrete categorical choices into soft continuous probability vectors, run gradient descent, then anneal the temperature back to recover a hard discrete selection. This makes searching the ~4.5M pair space tractable. Full math here →
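
As a quick sanity check on the ~4.5M figure: it matches the number of unordered pairs that can be drawn from about 3,000 candidate genes. The gene count is back-calculated here as an assumption. The actual size of the candidate panel is not stated above.

```python
import math


def n_pairs(n_genes):
    """Number of unordered marker pairs from n_genes candidates: C(n, 2)."""
    return math.comb(n_genes, 2)


# Assumption: roughly 3,000 candidate genes after filtering.
print(n_pairs(3000))  # 4498500, i.e. about 4.5 million
```

The count grows quadratically, so doubling the candidate panel roughly quadruples the number of pairs to score.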