# UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, but also for general non-linear dimension reduction. The algorithm is founded on three assumptions about the data:

- The data is uniformly distributed on a Riemannian manifold;
- The Riemannian metric is locally constant (or can be approximated as such);
- The manifold is locally connected.

From these assumptions it is possible to model the manifold with a fuzzy topological structure. The embedding is found by searching for a low dimensional projection of the data that has the closest possible equivalent fuzzy topological structure.

The details of the underlying mathematics can be found in our paper on arXiv:

McInnes, L., Healy, J., *UMAP: Uniform Manifold Approximation and Projection
for Dimension Reduction*, arXiv e-prints 1802.03426, 2018

You can find the software on GitHub.

**Installation**

Conda install, via the excellent work of the conda-forge team:

```
conda install -c conda-forge umap-learn
```

The conda-forge packages are available for Linux, macOS, and 64-bit Windows.

PyPI install, presuming you have numba and scikit-learn and all their requirements (numpy and scipy) installed:

```
pip install umap-learn
```

- How to Use UMAP
- Basic UMAP Parameters
- Transforming New Data with UMAP
- UMAP for Supervised Dimension Reduction and Metric Learning
- Using UMAP for Clustering
- Gallery of Examples of UMAP usage
- Frequently Asked Questions
- Should I normalise my features?
- Can I cluster the results of UMAP?
- The clusters are all squashed together and I can’t see internal structure
- Is there GPU or multicore-CPU support?
- Can I add a custom loss function?
- Is there support for the R language?
- Is there a C/C++ implementation?
- I can’t get UMAP to run properly!
- What is the difference between UMAP / VAEs / PCA?
- Successful use-cases