The UniLectin project gathers structural information on lectins, which are glycan-binding proteins from all origins, along with their interactions with carbohydrate ligands. Among the proteins that interact non-covalently with carbohydrates, lectins bind mono- and oligosaccharides reversibly and specifically while displaying no catalytic or immunological activity. Lectins are oligomeric proteins that specifically recognize carbohydrates and, according to present knowledge, act as macromolecular tools to decipher sugar-encoded messages. Complex carbohydrates (also referred to as glycans), whether as single molecules or bound to proteins and lipids, are the most abundant class of biomolecules. They are increasingly implicated in human health and environmental issues. Complex carbohydrates are built for high-density bio-coding, on a par with proteins and nucleic acids. The information carried by glycans is encoded by their 3D structure and sometimes by their dynamics. Consequently, the structural characterization of the interactions of lectins, the readers of the glyco-code, is essential.
Lectins are still poorly classified and annotated, and since their functions are based on recognition, we use their 3D structures as the foundation of the project. UniLectin is a curated database with a classification based on origin and fold, associated with literature and functional data such as known specificity. The content of UniLectin is centered on 3-dimensional data, using PDB information, with an appropriate curation of the glycan topology. It provides a family-based classification and cross-links to specialized glyco-related databases. In particular, each carbohydrate ligand can be seen, with one click, as part of the full carbohydrate structure. Finally, the contacts between the lectin and the ligand can be visualized in 3D via the Protein-Ligand Interaction Profiler (PLIP) server. The introduction of such a feature is likely to meet the expectations of lectin specialists. UniLectin contains 1740 manually curated lectin structures, corresponding to 426 different lectins (as of 2018-08-01). Bibliographic entries cover the 763 published articles describing at least one structure. The first classification level, referred to as Origins, separates the lectins into seven classes, which correspond to the main domains of the living kingdom. The second level orders the lectins according to protein fold into 75 classes. The third level separates the lectins according to their species into 309 families.
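The three-level classification described above (origin, then fold, then family) can be pictured as a nested tree. The following is a minimal sketch, not the UniLectin implementation: the function names and the example labels (`Origin-A`, `Fold-1`, `pdb1` and so on) are hypothetical placeholders, not real database entries.

```python
from collections import defaultdict

# Hypothetical records: (origin, fold, family, structure identifier).
# All labels are placeholders, not actual UniLectin content.
EXAMPLE_ENTRIES = [
    ("Origin-A", "Fold-1", "Family-1", "pdb1"),
    ("Origin-A", "Fold-1", "Family-1", "pdb2"),
    ("Origin-A", "Fold-1", "Family-2", "pdb3"),
    ("Origin-A", "Fold-2", "Family-3", "pdb4"),
    ("Origin-B", "Fold-1", "Family-4", "pdb5"),
]


def build_classification(entries):
    """Group structures into a tree: origin -> fold -> family -> [structures]."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for origin, fold, family, structure_id in entries:
        tree[origin][fold][family].append(structure_id)
    return tree


def count_levels(tree):
    """Return the number of nodes at each level: (origins, folds, families)."""
    n_origins = len(tree)
    n_folds = sum(len(folds) for folds in tree.values())
    n_families = sum(
        len(families) for folds in tree.values() for families in folds.values()
    )
    return n_origins, n_folds, n_families


if __name__ == "__main__":
    tree = build_classification(EXAMPLE_ENTRIES)
    print(count_levels(tree))
```

In this sketch a fold is counted once per origin in which it appears; whether the database counts folds that way is an assumption.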
Among the 1740 3D structures, 1085 are complexed with glycans. The most commonly observed monosaccharides are: galactose (Gal) 33% (576), N-acetylglucosamine (GlcNAc) 16% (288), glucose (Glc) 15% (260), mannose (Man) 14%, fucose (Fuc) 10% (169), N-acetylgalactosamine (GalNAc) 8% (141) and sialic acid (Neu5Ac) 7% (131). Rarer sugars (rhamnose, arabinose …) are also observed in complexes with lectins. The ligands occur as monosaccharides, but also as oligosaccharides or glycoconjugates. The set of distinct glycan ligands amounts to 362.
The lectin categories are displayed as both a sunburst and a tree to facilitate exploration. The search interface provides multiple criteria to select specific lectins or structures. For every structure, several items of manually curated information are available, mostly about the associated carbohydrate.
A simple search window is available on the UniLectin home page. The database is searched by entering keywords, PDB or UniProt accession numbers, fragments of glycan sequences, or fragments of the title of a publication.
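Because a single search box accepts several kinds of input, a client script may want to guess what kind of term it is sending. The sketch below is an illustration only and makes no claim about how UniLectin itself parses queries: `guess_query_type` is a hypothetical helper, the PDB pattern assumes the classic 4-character identifier (a digit followed by three alphanumerics), and the UniProt pattern follows the accession format published by UniProt. Glycan fragments and title words are not told apart here; both fall back to "keyword".

```python
import re

# Classic PDB identifier: one digit followed by three alphanumeric characters.
PDB_ID = re.compile(r"^[0-9][A-Za-z0-9]{3}$")

# UniProt accession format (6 or 10 characters).
UNIPROT_AC = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def guess_query_type(query):
    """Return 'pdb', 'uniprot' or 'keyword' for a free-text search term."""
    term = query.strip()
    if PDB_ID.match(term):
        return "pdb"
    if UNIPROT_AC.match(term.upper()):
        return "uniprot"
    return "keyword"


if __name__ == "__main__":
    # "1ABC" and "P12345" are format examples, not specific database entries.
    for term in ["1ABC", "P12345", "galactose"]:
        print(term, "->", guess_query_type(term))
```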
On the lectin category sunburst, the multiple PDB structures available in each origin, class and family can be explored. The taxonomic tree also allows the classification to be explored: click on the tree leaves to expand the tree and access subcategories. Click on the search icon to open the category in a new page.
The browser interface allows users to select a lectin category and explore its structures.
The advanced search offers a selection of criteria. Lectins can be searched by structure or by sequence family with the support of drop-down lists. The classification of lectins (Origin, Class, Family) also provides several search criteria. Other criteria pertain to the nature of the fold and taxonomic details of the lectin. Keywords from the title of a reference article can also be searched. A unique feature is the search by fragments of glycan ligands, which returns the lectins interacting with a given carbohydrate. Finally, a cutoff on the resolution (Å) of the X-ray structure can be used as a filter to select high-quality data. UniLectin allows a precise taxonomic search for all lectins that have been structurally characterized in a given organism. Another key feature is the option to search for lectin structures with oligosaccharide motifs complexed within the binding sites.
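The way these criteria combine can be sketched as a simple filter over structure records. This is a minimal illustration under stated assumptions: the `Structure` record, its field names and the `advanced_search` function are hypothetical, the example records are invented, and matching a glycan fragment by plain substring is a simplification that ignores glycan topology.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Structure:
    """Hypothetical record for one lectin structure."""
    pdb_id: str
    origin: str
    resolution: Optional[float]  # in Å; None when no resolution is reported
    glycan: str                  # bound ligand as text; empty if unliganded


def advanced_search(
    structures: List[Structure],
    max_resolution: Optional[float] = None,
    glycan_fragment: Optional[str] = None,
    origin: Optional[str] = None,
) -> List[Structure]:
    """Keep the structures that satisfy every criterion that was supplied."""
    hits = []
    for s in structures:
        # Resolution cutoff: lower values mean higher-quality data.
        if max_resolution is not None and (
            s.resolution is None or s.resolution > max_resolution
        ):
            continue
        # Naive fragment search: case-insensitive substring match.
        if glycan_fragment and glycan_fragment.lower() not in s.glycan.lower():
            continue
        if origin and s.origin != origin:
            continue
        hits.append(s)
    return hits


# Invented example records, for illustration only.
EXAMPLES = [
    Structure("pdb1", "Origin-A", 1.8, "Gal(b1-4)GlcNAc"),
    Structure("pdb2", "Origin-A", 2.9, "Gal(b1-4)Glc"),
    Structure("pdb3", "Origin-B", 1.5, ""),
    Structure("pdb4", "Origin-B", None, "Fuc(a1-2)Gal"),
]

if __name__ == "__main__":
    for hit in advanced_search(EXAMPLES, max_resolution=2.0, glycan_fragment="Gal"):
        print(hit.pdb_id, hit.resolution, hit.glycan)
```

Criteria left as `None` are ignored, which mirrors a search form where empty fields do not restrict the result.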
The lectins can be explored by UniProt ID; the multiple PDB structures available for each protein are displayed in the results.
The lectins can be explored by structure. Each structure is related to a protein with a UniProt ID. For each structure, the other structures of the same protein are proposed in the results.
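The relation between proteins and structures described above is one-to-many: one UniProt ID, several PDB structures. A minimal sketch of that lookup, assuming the data are available as (structure, protein) pairs; the function names and the identifiers `pdb1`, `protX` and so on are hypothetical placeholders.

```python
from collections import defaultdict


def index_by_uniprot(pairs):
    """Map each protein identifier to the list of its structures."""
    by_protein = defaultdict(list)
    for pdb_id, uniprot_id in pairs:
        by_protein[uniprot_id].append(pdb_id)
    return dict(by_protein)


def related_structures(pdb_id, pairs):
    """Return the other structures of the protein that pdb_id belongs to."""
    for members in index_by_uniprot(pairs).values():
        if pdb_id in members:
            return [m for m in members if m != pdb_id]
    return []


# Invented example pairs: (structure identifier, protein identifier).
PAIRS = [
    ("pdb1", "protX"),
    ("pdb2", "protX"),
    ("pdb3", "protY"),
    ("pdb4", "protX"),
]

if __name__ == "__main__":
    print(index_by_uniprot(PAIRS))
    print(related_structures("pdb2", PAIRS))
```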
Information can be obtained on the protein partner, the glycan partner, and the details of their interactions using the Protein-Ligand Interaction Profiler (PLIP) server. Special care was devoted to the description of the bound glycan ligands, with the use of a simple graphical representation and a numerical format for cross-linking to other databases in glycoscience. The architecture and navigation tools were designed to extend the search to all organisms, as well as to search for all oligosaccharide epitopes complexed within specified binding sites.
PLIP allows detailed visualization of the interactions on the 3D structure, the specific interaction types, and the surrounding residues.
The PDBe functional domain viewer provides information on the glycosylated sites, functional domains, and carbohydrate-binding sites.