pixel-seq protocol

Pixel-seq Assay

Transcript capturing and cDNA synthesis

A polony gel slide was disassembled from the flowcell, washed with milli-Q water and 3 × 200 μL wash buffer 1 (0.1 × SSC and 0.4 × Maxima H Minus RT buffer (Thermo Fisher, EP0753)), and dried in the PCR hood before use. For cryosectioning, a frozen tissue block was equilibrated at -20 ̊C in a Cryostat NX70 (Thermo Scientific) for 15 min, mounted onto a holder with O.C.T. (Fisher Healthcare, 4585), and sliced to 10-µm sections. Immediately after placing tissue sections on printed array positions on the dry gel surface, 50 μL tissue hybridization buffer (6 × SSC and 2 U/µL RNAseOUT (Thermo Fisher, 10777019)) was gently applied to immerse the sections and then the gel was incubated at R.T. for 15 min. After the hybridization, the buffer was removed by pipetting and the polony gel slide was assembled into multi-well reaction chambers (ProPlate; Grace Bio-Labs, 246868). cDNAs were synthesized by adding 100 µL reverse transcription (RT) mixture (5 µL Maxima H- reverse transcriptase (Thermo Fisher, EP0753), 20 µL 5 × Maxima RT buffer, 20 µL 20% Ficoll PM-400 (Sigma-Aldrich, F4375), 10 µL 10 mM each dNTPs, 5 µL 50 µM template-switching oligo (Qiagen, 339414YCO0076714), 2.5 µL RNAseOUT (40 U/µL), and 37.5 µL H2O) into each well and incubating the chambers at 42 ̊C for 1 hour. A Cy5-dCTP-labeled cDNA assay was performed at the same condition except replacing the dNTPs with 500 µM each dATP/dGTP/dTTP, 12.5 µM dCTP, and 25 µM Cy5-dCTP (PerkinElmer, NEL577001EA).

Tissue cleanup

After the cDNA synthesis, the reaction buffer was removed, and the tissues were washed by 3 × 200 µL 0.1 × SSC. 100 µL proteinase K digestion solution (10 µL proteinase K (Qiagen, 19131) in 90 µL PKD buffer (Qiagen, 1034963)) was added to each chamber and incubated at 55 ̊C for 30 min. To remove digested proteins, genomic DNAs, and others, each chamber was washed with 3 × 200 µL elution buffer 2 (2 × SSC and 0.1% SDS) and 3 × 200 µL wash buffer 2 (0.1 × SSC).

Sequencing library construction

Recovering spatially barcoded cDNAs from the gel and introducing unique molecular identifiers (UMIs) into cDNAs were achieved by the second-strand synthesis and primer extension, respectively. For example, 70 µL second-strand mix (7 µL 10 × isothermal amplification buffer (NEB, B0357), 7 µL 10 mM each dNTP mix, 3.5 µL 10 µM TSO primer, 0.5 µL 20 mg/mL BSA (NEB, B9000), 3 µL BST2.0 WarmStart DNA Polymerase (NEB, M0538), and 49 µL H2O) was added to each chamber and the chambers were sealed with a sealing film. After incubating the chambers at 65 ̊C for 15 min, the reagent was removed, and the gel was washed by 3 × 200 µL wash buffer 2. To elute the DNAs, a denature and elution mix (35 µL 0.05 M KOH) was added to each chamber, incubated at R.T. for 10 min, and neutralized by 5 µL Tris (1 M, pH 7.0). ~35 µL sample was transferred from each chamber into a tube and added with 65 µL UMI incorporation mix (50 µL 2 × Q5 Ultra II master mix (NEB, M0544), 2.5 µL 10 µM UMI primer, and 12.5 µL H2O). The UMI incorporation was performed by denaturing at 95 ̊C for 30 seconds, annealing at 65 ̊C for 30 seconds, and extension at 72 ̊C for 5 min in a PCR machine. Non-incorporated primer was removed by incubating the sample with 1 µL thermolabile exonuclease I (20 U; NEB, M0568) at 37 ̊C for 4 min and inactivating the exonuclease at 80 ̊C for 1 min. To amplify the cDNA library, the tube was added with 2 µL each 10 µM TSO and TruSeq sequencing primers and 1 µL Q5 HotStart polymerase (NEB M0493) for PCR amplification to obtain 5-10 ng DNA per sample. PCR reactions were performed as follows: annealing at 95 ̊C for 3 min, 4 amplification cycles each including 98 ̊C for 20 seconds, 65 ̊C for 45 seconds, and 72 ̊C for 3 min, 8 amplification cycles each including 98 ̊C for 20 seconds, 67 ̊C for 20 seconds, and 72 ̊C for 3 min, and a final incubation at 72 ̊C for 5 min. Typically, after the amplification, ~1 ng DNA was used for sequencing library construction with a Nextera XT kit (Illumina FC-131-1024) and the TruSeq LibP5 primer, and Nextera index primers following the manufacturer's protocol. From tissue cryosectioning to the completion of the sequencing library construction, the Pixel-seq assay took ~6 hours.

Polony Sequencing

To determine barcode sequences and distribution, the gel stamping was performed with a stamp and a copy gel on a round coverslip as described above. The coverslip was assembled into a FCS2 flowcell for polony sequencing. TaqI-cleaved polonies were hybridized with a polony sequencing primer following the above described template hybridization protocol and then sequenced with a HiSeq SBS kit v4 (Illumina, FC-401-4002). Images were acquired using a Nikon Ti-E automated inverted epifluorescence microscope equipped with a perfect focus system, a Nikon CFI60 Plan Fluor 40 ×/1.3-NA oil immersion objective, a linear encoded motorized stage (Nikon Ti-S-ER), and an Andor iXon Ultra 888 EMCCD camera (16-bit dynamic range, 1,024 × 1,024 array with 13-μm pixels). A four-channel imaging setup comprised two laser lines (Laser Quantum GEM 532 nm (500 mW) and Melles Griot 85-RCA-400 660 nm (400 mWS)) and two filter cubes with an emission filter (610/60-730/60 or 555/40-685/20; Chroma Technology) and a 532/660 dichroic mirror (Chroma Technology). Sequencing regents were added to the flowcell by a fluidic system comprising a multi-position microelectric valve (Valco Instruments EMH2CA) and a multi-channel syringe drive pump (Kloehn V6 12K). The sequencing was automated by building an application in Java 1.6 using jSerialComm (http://fazecast.github.io/jSerialComm/) to control the fluidic system and Micro-Manager v1.4.22 (https://www.micro-manager.org/) to acquire images from selected stage positions. The sequencing was performed with reagents provided in the HiSeq SBS kit following a standard HiSeq sequencing protocol. Each sequencing cycle includes i) pre-cleavage wash with a cleavage buffer, ii) dye and protection group removal by a cleavage mix, iii) post-cleavage wash with a high salt buffer, iv) a pre-incorporation wash with an incorporation buffer, v) incorporation with an incorporation mix, and vi) imaging acquisition in a scan mix. Sequencing 24-bp barcodes on a gel of 22 × 9 mm2 on our platform took ~35 hours.

Data Analysis

Raw sequencing tiff images were processed to extract intensities by Dlight, a custom-built suite in MATLAB. All images were registered to the merged images of the Cy3 and Cy5 channels from the first sequencing cycle using the imregcorr function. Next, a polony reference map was generated from images of the first eight sequencing cycles, termed template cycles. Polonies were identified by searching local signal thresholds (> median + 2 × standard deviation) in all template cycle images and then finding polony centroids with an optimal chastity value. A PhiX control library (Illumina FC-110-3001) was used to optimize Dlight parameters. Intensity values of polony centroids and image pixels were analyzed by a 3Dec base-caller (Wang et al., 2017) allowing the correction of the signal crosstalk between adjacent polonies. To find polony boundaries, unassigned image pixels were compared with spatial barcode-assigned polony centroids within a distance less than 5 pixels. These pixels were assigned with adjacent assigned spatial barcodes with the highest signal correlation coefficient above 0.7. A final spatial barcode map was constructed by combining all sequenced gel positions into a single image.

Transcript mapping

After cDNA library sequencing, FASTQ files were processed to map transcripts to the spatial barcode map. Spatial barcode and UMI sequences were first extracted by Flexbar (Roehr et al., 2017). The index sequences were mapped back to the spatial barcode map by Bowtie (Langmead et al., 2009) allowing up to 2 mismatches. Paired-end reads of mapped indices were aligned to mouse transcriptome (GRCm38) using STARv2.7.0 (Dobin et al., 2013) with a default setting. Sequencing reads with the same transcriptome mapping locus, UMI, and spatial barcode were collapsed to unique records for subsequent analysis.

A method, volume-distance-based segmentation of mapped transcripts (V-seg) implemented in R, was developed to aggregate feature-level UMIs to single cells. V-seg first used mapped transcript data to construct a rNN-network (nearest neighbors within a defined radius, r) by connecting spatial barcodes as nodes. The network was constructed using a cutoff edge distance of 2 µm to ensure every index was connected to at least one different barcode and also decrease edge or connection redundancy.

Next, the edge-weighted network was taken as an input for the graph-based segmentation with a Fast Greedy algorithm using an iGraph R package. Because the Fast Greedy algorithm is sensitive to the network size, to maximize a modularity score, it is necessary to optimize segmentation parameters for different tissues. In case of the OB data in Figure 2, the edge threshold was the weight < 3 and the spatial transcript map was divided to images of 800 × 800 pixels (1 pixel = 0.325 µm) for individual processing before stitching them together. In the PBN analysis in Figures 3 and 4, the segmentation was performed in two steps to separately process aggregated Calca cells and other cell types. Step 1: the edge threshold was the weight < 4 and the transcript map was divided to images of 600 × 600 pixels. Segmented cells were clustered by Seurat to separate Calca cells. Step 2: Other cells were re-segmented using the edge threshold of the weight < 4 and the image size of 800 × 800 pixels.

Clustering analysis

SCTransform normalization was applied in Seurat using 3,000 highly variable genes. After Principal Component Analysis (PCA), UMAP visualization, and the clustering graph was generated with a resolution of 0.8.


scRNA-seq-guided annotation

Segmented cells were annotated with a published dataset (e.g., OB scRNA-seq dataset (GSE121891) (Tepe et al., 2018)) as a reference to predict cell type compositions using scvi-tools (Gayoso et al., 2021). The top 3,000 variable genes were selected for the model training. Raw gene counts for training and testing datasets were scaled to 104. The training and prediction were run at max_epoches = 50. Cell types of segmented cells were determined by the predicted cell types with the highest ratios.


Differential abundance analysis

The differential abundance analysis were analyzed by DA-seq (Zhao et al., 2021) to identify differences between cell distributions in the UMAP space for different conditions.


Spatial pattern gene analysis

To identify transcript distributions showing different spatial patterns in different conditions, the cell-gene matrix and cell spatial coordinates were processed by Giotto (Dries et al., 2021). The analysis used a default setting (e.g., a minimum detected genes of 64 per cell and a scale factor of 6,400) to find spatially patterned genes by a “ranking” method. The top 200 spatially patterned genes were chosen for a GO enrichment analysis (Kuleshov et al., 2016).


PCCF statistic

We quantitatively compared cell contacts (or colocalization) between the same or different cell types using a pairwise cross-correlation function (PCCF) statistic which was previously applied to analyze polony colocalization (Gu et al., 2014). Here, to analyze cell colocalization, PCCF was defined as the ratio of a detected number of colocalization events to a random colocalization possibility. The random colocalization was simulated by a simplified model, where irregular cells were represented as round shapes with original sizes. After randomizing original cell positions, touched and overlapped cell masks were counted; 100 simulations were performed to calculate the average colocalization count as the random colocalization possibility.


Pseudotime analysis

The pseudotime analysis was performed by Slingshot (Street et al., 2018) with parameters of stretch = 2 and thresh = 0.001. The P-value of the pseudotime values for the different conditions was calculated with a Kolmogorov-Smirnov test.


Cell-cell communication

The spatial profiling of signaling likelihoods of individual cells as senders and receivers was performed by SpaOTsc with default parameters (Cang and Nie, 2020). For cluster-to-cluster communication. A Kolmogorov-Smirnov test was used to compare cluster-level signaling likelihoods for paired ligand and receptor(s) genes between different conditions.