Be careful of poly-G sequences from NextSeq runs
We occasionally find an unexpected number of poly-G reads in raw data generated on the NextSeq. See the FastQC figure below for a typical example. In the k-mer content plot, various G-enriched k-mers peak along almost the entire length of the reads. Poly-G sequences will probably also show up in the overrepresented sequences table.

What causes poly-G?
As shown in the left figure below, HiSeq and MiSeq use a four-color method for base calling. Each color represents one base type. Once all 4 imaging