Help

  1. Introduction - miRror Concept
  2. miRror Suite
  3. Databases
  4. miRror 2.0
  5. PSI-miRror
  6. miRrorNet
  7. Optional parameters
  8. ID conversion
  9. Validation assessment
  10. Technical remarks
  11. Citing miRror
  12. User Manual

1. Introduction – miRror Concept

miRror is a platform by which an experimentalist can gain biological insights into sets of molecular entities. Such sets are the product of large-scale experiments such as miRNA profiling, mass spectrometry proteomics and gene expression studies. miRror operates in a dual mode, in which sets of (i) miRNAs or (ii) gene-targets / proteins are used as the input (Fig. 1).


Fig. 1. Scheme of the miRror protocol for a Gene SET as input. The miRror protocol also operates with a miRNA SET as input.



The miRtegrate protocol is the core of miRror. It calculates the probability of matches between a set of miRNAs and the most likely set of genes that are targeted, based on a number of miRNA-target prediction databases.

The statistical basis of miRror provides a common ground for all miRNA-target prediction databases. We calculate the probability of a gene's interaction with the collection of miRNAs in the user set as opposed to the rest of the miRNAs in that database. The miRNA-target coverage associated with each of the prediction databases is used to calculate a threshold of statistical significance.

The p-value for miRNAs as input is computed according to the hypergeometric distribution with a correction for multiple testing. The same principle is applied with genes as input.

Formally, N is the total number of miRNAs in the database, n is the number of miRNAs that match the specific gene, k is the number of those matching miRNAs that appear in the user set, and m is the total number of miRNAs in the user set.

Therefore, for each gene, the probability of exactly k matches with miRNAs in the user set is:

$$P(X = k) = \frac{\binom{n}{k}\binom{N-n}{m-k}}{\binom{N}{m}}$$


By default, a p-value threshold of <0.05 is applied, but this and additional parameters can be changed by the user.
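As an illustration of the statistic described above, the following is a minimal sketch that assumes the standard hypergeometric form implied by the definitions of N, n, k and m. The function names and the example numbers are hypothetical, the upper-tail sum P(X >= k) is an assumed way of turning the probability into a p-value, and the multiple-testing correction is not included.

```python
from math import comb


def hypergeom_pmf(k, N, n, m):
    """P(X = k): exactly k of the m user miRNAs are among the n miRNAs matching the gene."""
    if k < max(0, m - (N - n)) or k > min(n, m):
        return 0.0
    return comb(n, k) * comb(N - n, m - k) / comb(N, m)


def hypergeom_pvalue(k, N, n, m):
    """Upper-tail probability P(X >= k) (assumed form of the p-value)."""
    return sum(hypergeom_pmf(x, N, n, m) for x in range(k, min(n, m) + 1))


# Hypothetical numbers, for illustration only:
# a database with 500 miRNAs, 40 of which match the gene,
# and a user set of 20 miRNAs, 5 of which match the gene.
p = hypergeom_pvalue(k=5, N=500, n=40, m=20)
print(f"p = {p:.4g}; below the default 0.05 threshold: {p < 0.05}")
```

Exact integer binomials (`math.comb`) are used here so that the sketch needs no external library.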

2. miRror Suite


Fig. 2. miRror Suite is composed of 3 applications.


miRror Suite covers miRNAs from Human, Mouse, Rat, C. elegans, Fly and Zebrafish. It is composed of three interlinked analysis tools: (i) miRror 2.0, (ii) PSI-miRror and (iii) miRrorNet (Fig. 2).

3. Databases


The supported databases depend on the selected organism. miRror 2.0 supports 15 miRNA-target prediction databases that are based on different algorithms and different scoring methods. The availability of the databases for the different organisms is summarized in Table 1.

The number of miRNAs and targets covered by each of the databases varies significantly. The numbers of gene-targets that are successfully mapped by the ID convertor are listed in Table 2. The numbers of miRNAs that are covered are listed in Table 3.

 

These numbers are used by miRtegrate to calculate the statistical significance. The user is encouraged to select any combination of databases (at least 2 databases). The selection of databases affects the presented results. There are 3 predetermined options for each of the supported organisms:

(i) Select all databases (as listed in Table 1). This option is suggested when the user wishes to get the most extensive results.

(ii) A recommended set includes one specific database from each prediction resource that provides more than one database. For example, it includes PITA_top and excludes PITA_all.

(iii) A minimal set includes the databases that have the maximal coverage (in terms of genes and miRNAs). The minimal set is a subset of the recommended set and may be used for a fast analysis.


Table 1. Database availability according to the selected organism. Y indicates that the database covers the indicated organism; - indicates that it does not. A light blue background indicates that the scoring or ranking method of the database can be used as a filter. See Optional parameters.

Database     Fly  Human  Mouse  Rat  Worm  Zebrafish
MAMI         -    Y      -      -    -     -
Map2         -    Y      Y      Y    Y     Y
mCosm        Y    Y      Y      Y    Y     Y
microT       -    Y      Y      -    -     -
miRanda      Y    Y      Y      Y    Y     -
miRDB        -    Y      Y      Y    -     -
mirZ         Y    Y      Y      Y    Y     Y
PicTar       Y    -      Y      -    Y     -
PicTar_4way  -    Y      -      -    -     -
PicTar_5way  -    Y      -      -    -     -
PITA_all     Y    Y      Y      -    Y     -
PITA_top     Y    Y      Y      -    Y     -
RNA22        Y    Y      Y      -    Y     -
TRank_all    -    Y      Y      -    -     -
TRank_con    -    Y      Y      -    -     -
TScan        Y    Y      Y      -    Y     -



4. miRror 2.0

miRror 2.0 applies the miRror concept in its advanced mode. Accordingly, the user can select the parameters to refine the analysis and apply a set of filters to refine the results.

All optional selections are shown in the scheme by a dark blue background box (Fig. 3). miRror 2.0 can be operated in the default mode with preselected parameters. The user needs to provide a Gene set (or a miRNA set) and to select the desired organism. Examples of a miRNA set and a Gene set can be loaded from the mouse example sets. In all cases, default parameters will be provided. We encourage the user to change the parameters and filters to improve and refine the results.

Table 2. Coverage of genes in all supported databases following conversion to RefSeq IDs. - indicates that the database does not cover the organism.

Database     Fly    Human  Mouse  Rat    Worm   Zebrafish
MAMI         -      11916  -      -      -      -
Map2         -      10183  6749   3716   4684   1502
mCosm        10372  16933  18617  14295  16340  10909
microT       -      16886  17532  -      -      -
miRanda      9719   18110  16563  5914   8543   -
miRDB        -      18561  16789  9928   -      -
mirZ         8839   27501  21858  15222  3499   1411
PicTar       3623   -      6332   -      3044   -
PicTar_4way  -      8544   -      -      -      -
PicTar_5way  -      2973   -      -      -      -
PITA_all     7045   14395  15500  -      8808   -
PITA_top     2518   8293   8438   -      2171   -
RNA22        3271   11643  11192  -      1932   -
TRank_all    -      15456  12770  -      -      -
TRank_con    -      10653  8894   -      -      -
TScan        11415  11152  8930   -      15054  -



Table 3. Coverage of miRNAs in all supported databases following the conversion to RefSeq IDs. - indicates that the database does not cover the organism.

Database     Fly  Human  Mouse  Rat  Worm  Zebrafish
MAMI         -    318    -      -    -     -
Map2         -    471    381    239  135   220
mCosm        94   712    569    699  137   220
microT       -    556    375    -    -     -
miRanda      117  250    239    193  104   -
miRDB        -    704    571    292  -     -
mirZ         142  712    549    327  121   191
PicTar       77   -      266    -    118   -
PicTar_4way  -    179    -      -    -     -
PicTar_5way  -    130    -      -    -     -
PITA_all     148  678    492    -    155   -
PITA_top     148  678    492    -    155   -
RNA22        79   314    234    -    115   -
TRank_all    -    556    462    -    -     -
TRank_con    -    128    128    -    -     -
TScan        120  675    490    -    104   -




Fig. 3. A scheme of the miRror 2.0 protocol for a Gene SET as input. The parameters and filters that are available to the user are shown in dark blue. The miRror 2.0 protocol also operates with a miRNA SET as input.


miRIS stands for miRror Internal Score. It balances the two major constraints of miRror predictions: the minimal number of DBs that support the prediction and the minimal number of hits from the input set. The score ranges from 0 to 1. For example, consider an input of 20 miRNAs and a search using 8 DBs: a gene predicted by 10/20 input hits and 6/8 supporting DBs receives a miRIS of 0.625. Recall that for each set of parameters a minimal miRIS can be calculated. miRIS is used as a filter that restricts the results rather than as a search parameter.
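The exact miRIS formula is not given here. The worked example (10/20 hits and 6/8 DBs giving 0.625) is consistent with a simple average of the two fractions, so the sketch below assumes that form; the function name and its arguments are hypothetical.

```python
def miris(hits, input_size, supporting_dbs, selected_dbs):
    """Assumed miRIS: mean of the input-hit fraction and the supporting-DB fraction."""
    if input_size <= 0 or selected_dbs <= 0:
        raise ValueError("input_size and selected_dbs must be positive")
    return (hits / input_size + supporting_dbs / selected_dbs) / 2


# The example from the text: 10 of 20 input miRNAs hit the gene, 6 of 8 DBs support it.
score = miris(hits=10, input_size=20, supporting_dbs=6, selected_dbs=8)
print(score)  # 0.625
```

Under this assumption, the minimal miRIS for a given parameter set is obtained by passing the minimal number of hits and the minimal number of supporting DBs.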

The filtered list of results can be forwarded to several annotation schemes, including Reactome, DAVID, PANDORA and STRING (Fig. 4).



Fig. 4. miRror 2.0 results are forwarded to several annotation resources for a deeper analysis.

5. PSI-miRror

PSI-miRror (Probability Supported Iterative miRror) is an iterative tool that is based on the miRror concept (Fig. 1). The concept is illustrated in Fig. 5.

The PSI-miRror iterative mode is applied to the input set (e.g., Input Genes1, see scheme) in order to retrieve a refined gene list. The output of PSI-miRror is a refined list (Input Genes2). The primary input may be a miRNA set or a Gene set.

The advantage of PSI-miRror is in refining the input set. Specifically, along the process, Genes (or miRNAs) may be added to or removed from the original set. Addition or removal of entities from the original list is based on a PSI-internal score of the PSI-miRror algorithm. Intuitively, the iterative protocol seeks an improvement in this score by testing the stability of the original list using miRror results from the first cycle.


Fig. 5. The concept of PSI-miRror. The first step is a standard miRror 2.0. The results are fed iteratively to refine the original input set. The output of PSI-miRror is illustrated by the Venn diagram.

A hypothetical example for a set of 20 genes with a selection for 2 iterations is described:

A set of 20 genes is used as the input (Gene1, see Fig. 6). A miRror 2.0 protocol is run, resulting in 10 miRNAs. In the first iteration, these 10 miRNAs are used as the input (Fig. 6). The results report 18 genes, but only 15 of them were included in the original set (note that intermediate results are not reported by PSI-miRror). In the second iteration, these 18 genes are used as the input set, yielding 12 miRNAs. In the same iteration, the 12 miRNAs are in turn used as the input, yielding 22 genes, of which only 19 are included in the original set.

These results are presented graphically as a Venn diagram that shows the overlap between the original input set and the set obtained after applying the PSI-miRror iterative mode. In this example, the diagram indicates an overlap of 19/20 genes and the addition of 3 genes that were not included in the original gene set.
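The counts in this hypothetical example can be reproduced with plain set operations. The gene names below are placeholders, not real identifiers; only the set sizes follow the example.

```python
# 20 placeholder genes in the original input set.
original = {f"gene{i}" for i in range(1, 21)}

# 22 genes in the refined set: 19 from the original set plus 3 new ones.
refined = {f"gene{i}" for i in range(1, 20)} | {"geneA", "geneB", "geneC"}

overlap = original & refined   # genes kept by the iterative protocol
added = refined - original     # genes recovered that were not in the input
removed = original - refined   # input genes that were dropped

print(len(overlap), len(added), len(removed))  # 19 3 1
```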

The same protocol is applied in PSI-miRror for miRNAs as original input.

Several benefits are associated with PSI-miRror:

1. It can recover a gene that has a similar property in view of its miRNA regulation but was not included in the original set.

2. miRNAs from the same families that were not included in a miRNA profiling experiment may be recovered by such a protocol.

3. It allows testing the coherence of a SET relative to a SET. In miRror 2.0 we test a SET relative to a single molecule (miRNA or Gene).

PSI-miRror converges when an additional iteration no longer changes the PSI-internal score. PSI-miRror supports two modes of operation:

(1) Gene2Gene and miR2miR mode (Fig. 6).

    A protocol in which an input of Gene SET leads to a refined list of Gene SET.

    A protocol in which an input of miRNA SET leads to a refined list of miRNA SET.

(2) Gene2miR and miR2Gene mode (Fig. 7).

    A protocol in which an input of Gene SET leads to a list of miRNAs.

    A protocol in which an input of miRNA SET leads to a list of Genes.

The same example described in Fig. 6, run in the Gene2miR mode, is illustrated in Fig. 7. The user may decide to retrieve the results without completing the last phase of the iterative cycle. In such a protocol, the last phase of the iteration is not executed (marked as PSI 1.5).

In this instance, an input of Gene SET will provide a list of miRNAs and symmetrically, an input of miRNA SET will provide a list of Genes.

In all cases, default parameters will be selected. We encourage the user to apply and change the parameters and filters to improve and refine the results.
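The Gene2Gene iteration described in this section can be sketched as a loop of gene -> miRNA -> gene rounds. This is only an illustration under strong assumptions: the PSI-internal score is not defined here, so the sketch stops when the gene set no longer changes; the two lookup functions, the `min_hits` rule and the toy prediction table are hypothetical stand-ins for a full miRror run with its p-value filtering.

```python
def psi_gene2gene(genes, genes_to_mirnas, mirnas_to_genes, max_iterations=3):
    """Refine a gene set by repeated gene -> miRNA -> gene rounds."""
    current = set(genes)
    for _ in range(max_iterations):
        mirnas = genes_to_mirnas(current)
        refined = mirnas_to_genes(mirnas)
        if refined == current:  # stand-in for an unchanged PSI-internal score
            break
        current = refined
    return current


# Toy prediction table (placeholder names): miRNA -> predicted target genes.
TARGETS = {
    "miR-a": {"g1", "g2", "g3"},
    "miR-b": {"g2", "g3", "g4"},
    "miR-c": {"g5"},
}


def genes_to_mirnas(genes, min_hits=2):
    """miRNAs that target at least `min_hits` genes of the set."""
    return {m for m, t in TARGETS.items() if len(t & genes) >= min_hits}


def mirnas_to_genes(mirnas):
    """All genes targeted by any miRNA of the set."""
    return set().union(*(TARGETS[m] for m in mirnas))


refined = psi_gene2gene({"g1", "g2", "g5"}, genes_to_mirnas, mirnas_to_genes)
print(sorted(refined))  # ['g1', 'g2', 'g3', 'g4']
```

In this toy run, g5 is dropped from the input and g3 and g4 are added, which mirrors the addition and removal of entities described for PSI-miRror.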





Fig. 6. Illustration of the Gene2Gene mode of operation of PSI-miRror. See text for details.



Fig. 7. Illustration of the Gene2miR mode of operation of PSI-miRror. See text for details. The last phase of the iteration cycle is not executed (marked as PSI 1.5).



6. miRrorNet

miRrorNet tests the susceptibility of signaling pathways to a small set of miRNAs. The input to miRrorNet is a pathway from the curated KEGG database for human and mouse.

The KEGG pathways that are supported by miRrorNet include human diseases, organismal systems, cellular processes, and genetic and environmental information processing. Currently, a total of ~100 pathways for human and the same number for mouse are included.

miRror is applied to indicate a small set of preferred miRNAs that potentially alter the genes in the pathway graphs. miRrorNet is restricted to miRNA triplets (coined miR-Trios). The resulting miR-Trios are proposed to serve as leads for the rational design of miRNA-based cellular perturbation.


Fig. 8. Partition of the best score achieved for 100 KEGG human pathways according to the different scoring methods. DIS=0 indicates pathways for which no miR-Trio was found that leads to disconnecting the pathway graph.


We apply 4 alternative scoring systems. Each scoring method is normalized and ranges from 0 to 1. The scores capture several aspects of pathway integrity.

FES - Fraction of the Edges Score. This score gives a higher score to miR-Trios that remove the largest fraction of edges.

BES - BEtweenness Score. This score rewards miR-Trios that eliminate vertices with high centrality.

PAS - PArtition square Score. This score gives high scores to miR-Trios that break down the pathway into multiple smaller connected components.

DIS - Disconnection Score. It reports the best performing score out of any of the alternative ones (FES, BES and PAS) for a particular pathway.

The user can select to activate any of the scores or their combinations. Although a positive correlation was observed between them, the scoring methods are complementary and provide a different outcome for each pathway. The partition of the scoring methods that provide the best score for all 100 human KEGG pathways is shown (Fig. 8). For about 25% of the pathways (DIS=0), there is no miR-Trio that successfully leads to disconnecting the pathway graph.

The logic and the sequential steps of miRrorNet are summarized in Fig. 9.

(A) Graph representation of a regulatory or signaling pathway.

(B) miRror application for the input of all genes (G1-G6) in the pathway resulting in a minimal set of miRNAs (M1-M8).

(C) miRNAs produce a collection of all possible miR-Trios.

(D) Each gene targeted by the miR-Trio is eliminated from the graph. The effect on graph connectivity is measured by alternative scoring methods.

(E) The collection of miR-Trios is tested for the maximal disconnection score that specifies the most effective miR-Trios.

The output of miRrorNet is a small set of miR-Trios that were scored maximal for their ability to disrupt the integrity of the specific pathway.
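Steps (C) to (E) can be sketched for the FES score alone. The sketch assumes that FES is simply the fraction of pathway edges lost when the genes targeted by a miR-Trio are removed; the exact normalization used by miRrorNet is not specified here, and the pathway, the miRNA names and the target table are toy placeholders.

```python
from itertools import combinations


def fes(edges, removed_genes):
    """Assumed FES: fraction of pathway edges lost when the given genes are removed."""
    if not edges:
        return 0.0
    remaining = [(a, b) for a, b in edges
                 if a not in removed_genes and b not in removed_genes]
    return 1 - len(remaining) / len(edges)


def best_trios(edges, targets):
    """Score every miRNA triplet and return (best score, list of trios reaching it)."""
    scored = []
    for trio in combinations(sorted(targets), 3):
        removed = set().union(*(targets[m] for m in trio))
        scored.append((fes(edges, removed), trio))
    if not scored:  # fewer than 3 miRNAs: no trio can be formed
        return 0.0, []
    best = max(score for score, _ in scored)
    return best, [trio for score, trio in scored if score == best]


# Toy linear pathway G1-G2-G3-G4-G5-G6 and toy miRNA -> target genes table.
edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G4", "G5"), ("G5", "G6")]
targets = {"M1": {"G1"}, "M2": {"G2"}, "M3": {"G4"}, "M4": {"G6"}}

best, trios = best_trios(edges, targets)
print(best, trios)  # 1.0 [('M2', 'M3', 'M4')]
```

Exhaustive enumeration is feasible here because the candidate miRNA set returned in step (B) is small; BES and PAS would replace only the `fes` function.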





Fig. 9. The logic and the sequential steps of miRrorNet

miRNAs that are shared by other miR-Trios are indicated. They are ranked according to the number of occurrences within the top-scoring miR-Trios for that pathway.



7. Optional parameters

For miRror 2.0 additional parameters were included to improve the analysis and to allow the user to focus on the preferred analysis.

a. Tissues and cell line expression

Gene expression data for tissues and cell lines are from the GNF, based on GSE1133 (human) and GSE10246 (mouse).

Expression is available for 79 human tissue types and 61 mouse tissues. In addition, NCI60 provides expression profiles for most human transformed cell lines.

b. Gene expression levels

The user can select to perform the analysis on all genes or on genes that are expressed above a threshold (raw data, intensity 10).

c. Predictors scoring schemes

The user can choose to use all database predictions or only those assigned to the top 10%, 25% or 50%. Because the scoring method differs between databases, and some of the resources do not provide a numeric score, the distribution of the scores for each miRNA was computed to provide a percentile threshold. The databases that are supported by scoring are indicated in Table 1.
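A per-miRNA percentile filter of this kind could look like the sketch below. It is an assumption-laden illustration: it assumes that a higher score means a stronger prediction, that at least one target is kept per miRNA, and that ties are broken arbitrarily; the function name and the prediction records are hypothetical.

```python
import math


def top_fraction_per_mirna(predictions, fraction):
    """Keep, for each miRNA, the best-scoring `fraction` of its predicted targets.

    `predictions` is an iterable of (mirna, gene, score) records.
    """
    by_mirna = {}
    for mirna, gene, score in predictions:
        by_mirna.setdefault(mirna, []).append((score, gene))

    kept = []
    for mirna, scored in by_mirna.items():
        scored.sort(reverse=True)  # assumes a higher score is a better prediction
        n_keep = max(1, math.ceil(len(scored) * fraction))
        kept.extend((mirna, gene, score) for score, gene in scored[:n_keep])
    return kept


# Placeholder records, for illustration only.
predictions = [
    ("miR-x", "gA", 0.9), ("miR-x", "gB", 0.7),
    ("miR-x", "gC", 0.4), ("miR-x", "gD", 0.1),
    ("miR-y", "gE", 0.8), ("miR-y", "gF", 0.3),
]

top25 = top_fraction_per_mirna(predictions, 0.25)
top50 = top_fraction_per_mirna(predictions, 0.50)
print(top25)
print(top50)
```

Computing the threshold separately for each miRNA, as the text describes, keeps miRNAs with generally low scores from being filtered out entirely.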

For PSI-miRror, all parameters of miRror 2.0 are available. In addition, the user can select:

a. Number of iterations: The default is 3 iterations.

b. Mode of operation: miR2miR / Gene2Gene or alternatively miR2Gene / Gene2miR. The default is the miR2miR / Gene2Gene mode.

For miRrorNet, the optional parameters are restricted to the same basic ‘advanced’ parameters of miRror (e.g., P-value threshold, number of databases, number of hits from the original input SET and the combination of these free parameters). In addition:

a. Scoring methods. BES is selected as the default scoring system. In addition, the user can select other methods. By selecting DIS, all three methods (FES, BES, PAS) will be activated and the best result will be reported.

b. Email reply. Some of the analyses may take a few minutes, especially when large pathways are considered and all scoring methods are activated. We provide a link that will retain the results for 48 hours. The user will be notified by email when the results are ready.

 

8.  ID conversion

miRror combines 15 resources that are based on different gene identifiers. The RefSeq identifier is the central entry.

miRror Suite supports UniProtKB accession and ID, official gene symbol, FlyBase accession (for Drosophila) and Ensembl ID. Altogether, 15 identifiers are supported and interconnected between the different resources.

The match of RefSeq to the original accession of any of the resources reaches 97-98% for human and mouse but only 73% for Drosophila. Currently, gene isoforms are not explicitly supported.

 

9. Validation assessment

Validated miRNA-target pairs are archived in the TarBase and miRecords resources. In the output table of miRror 2.0 and PSI-miRror, the experimentally validated miRNAs are highlighted. We have not included these resources in the miRror analysis. Instead, a post-analysis validation is applied. The coverage of these resources is also listed in Table 2 and Table 3.

 

10. Technical remarks

The miRror Suite will be updated twice a year.

Several expansions of miRror Suite include:

1. Adding pathways from Reactome and other high-quality pathway resources.

2. Personal archive of results. miRrorNet analysis may take a few minutes to complete. We will include a server that gives users access to their previous results.

3. We present a ‘history’ option that covers all inputs and results from the same session.

 

11. Citing miRror

The method behind miRror is presented in:

Friedman Y, Naamati G, Linial M. (2010) MiRror: a combinatorial analysis web tool for ensembles of microRNAs and their targets. Bioinformatics. 26:1920-1921.

 

12. User Manual
       
See the tutorial.



Please send comments and feedback.