Protein and Nucleotide Sequence Search

Finding identical, similar, and homologous biosequences for you!

Biosequence search, which covers both protein and nucleotide sequence search, is a key tool in the biotechnology and pharmaceutical industries for identifying prior art that discloses protein, peptide, DNA, or RNA sequences identical or similar to a query sequence. Such a search is fundamental to assessing whether a given biological sequence is novel, and it supports the development of new inventions across the pharmaceutical, agricultural, and biotech industries.

The significance of biosequence search is rooted in historical achievements such as Frederick Sanger's determination of the amino acid sequence of insulin, beginning with the B chain in 1951. Insulin was the first protein to be fully sequenced, and the work paved the way for the later sequencing of other long-chain molecules such as DNA. This milestone underscored the importance of understanding biological sequences in advancing scientific knowledge and innovation.

Primary Uses of Biosequence Search:

Novelty Assessment: Determines whether a biosequence has already been disclosed, which supports the patentability analysis for an invention.
Innovation Development: Facilitates the creation of new products and technologies in life sciences, especially within pharmaceutical, agricultural, and biotech sectors.

Major Databases and Tools for Biosequence Search:

NCBI-BLAST (Basic Local Alignment Search Tool): A tool provided by the National Center for Biotechnology Information for comparing an input sequence against a database of sequences.
EMBL-EBI-BLAST: Offered by the European Molecular Biology Laboratory's European Bioinformatics Institute, this tool also facilitates sequence comparison.
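DDBJ (DNA Data Bank of Japan): A repository of nucleotide sequences that exchanges data with GenBank and the European Nucleotide Archive through the International Nucleotide Sequence Database Collaboration (INSDC).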
GenBank: A comprehensive public database of nucleotide sequences and supporting bibliographic and biological annotation.
Patent Lens (now The Lens): A searchable database of patents, including those that disclose biological sequences.
UniProt: A central repository of protein sequence and annotation data.
STN Express BLAST: A tool for searching chemical structures and sequences within scientific and patent literature.
GenomeQuest: Offers sequence data management and analysis tools.

These databases and tools are integral to the storage, updating, and retrieval of nucleic acid and protein sequences, and they rely on dedicated software systems to manage large volumes of sequence data efficiently.
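To make the idea of storing and retrieving sequences concrete, here is a minimal sketch in Python. It assumes the records are available as FASTA-formatted text, a common plain-text exchange format for sequences, and it only finds exact matches. The function names (parse_fasta, build_index, find_identical) and the example peptides are hypothetical and chosen for illustration; none of the services listed above is implemented this way as far as this page states.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks).upper()
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks).upper()


def build_index(records: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each distinct sequence to the identifiers that carry it."""
    index: Dict[str, List[str]] = {}
    for identifier, sequence in records:
        index.setdefault(sequence, []).append(identifier)
    return index


def find_identical(query: str, index: Dict[str, List[str]]) -> List[str]:
    """Return identifiers whose stored sequence equals the query exactly."""
    return index.get(query.strip().upper(), [])


# Made-up records for illustration only; not taken from any real database.
EXAMPLE_FASTA = """\
>seq1 hypothetical peptide A
ACDEFGHIKL
MNPQRSTVWY
>seq2 hypothetical peptide B
GGGGSGGGGS
>seq3 duplicate of peptide A
ACDEFGHIKLMNPQRSTVWY
"""

index = build_index(parse_fasta(EXAMPLE_FASTA))
hits = find_identical("acdefghiklmnpqrstvwy", index)
print(hits)  # ['seq1', 'seq3']
```

An exact lookup like this only answers the "identical" part of a search. Finding similar or homologous sequences requires an alignment-based comparison.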

Types of Sequence Alignment:

Pairwise Sequence Alignment: Compares two sequences to identify regions of similarity that may indicate functional, structural, or evolutionary relationships. Tools and methods for pairwise alignment include BLAST and dot plots.
Multiple Sequence Alignment: Aligns more than two sequences at once to reveal sequence homology, evolutionary relationships, or functional similarities. Key tools for multiple alignment include ClustalW, ProbCons, MUSCLE, MAFFT, and T-Coffee.
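As an illustration of pairwise alignment, the sketch below implements the classic Smith-Waterman local alignment algorithm in plain Python. This is not how BLAST works internally: BLAST uses a faster heuristic to approximate this kind of local alignment. The scoring values (match 2, mismatch -1, gap -2), the linear gap penalty, and the helper percent_identity are assumptions chosen for illustration; real searches typically use substitution matrices and affine gap penalties.

```python
from typing import List, Tuple


def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1,
                   gap: int = -2) -> Tuple[int, str, str]:
    """Return (best score, aligned part of a, aligned part of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score: List[List[int]] = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; a cell never drops below 0, which makes it local.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,  # align a[i-1], b[j-1]
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    out_a: List[str] = []
    out_b: List[str] = []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Share of alignment columns holding the same residue, in percent."""
    if not aligned_a:
        return 0.0
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


best_score, top, bottom = smith_waterman("TTACGTTT", "GGACGTGG")
print(best_score, top, bottom, percent_identity(top, bottom))
# 8 ACGT ACGT 100.0
```

The two inputs differ at both ends, but the local alignment isolates the shared ACGT core. This is the behaviour that makes local alignment useful when a short disclosed sequence is embedded in a longer one.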

Biosequence search and alignment are indispensable for exploring and understanding biological sequences, and they play a critical role in the patenting process for biotechnological and pharmaceutical innovations. They enable researchers and inventors to navigate the complex landscape of biological data and help establish whether a new invention is both novel and non-obvious.