Network analysis
This notebook shows examples of how to perform network analysis on TF-COMB co-occurrence results.
Load CombObj containing rules
First, we load the CombObj that was created in the “select rules” notebook.
[1]:
import tfcomb
from tfcomb import CombObj
C = CombObj().from_pickle("../data/GM12878_selected.pkl")
[2]:
C
[2]:
<CombObj: 83705 TFBS (86 unique names) | Market basket analysis: 282 rules>
Visualize network
We can now use the CombObj to directly visualize the network. The first time the function is run, the .network attribute will be created within the CombObj:
[3]:
C.plot_network()
WARNING: The .network attribute is not set yet - running build_network().
INFO: Finished! The network is found within <CombObj>.network.
[3]:
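To illustrate what building a network from co-occurrence rules involves, here is a minimal sketch in plain Python. It is not TF-COMB's build_network() implementation: the rule tuples, their scores, the build_adjacency helper and the choice to treat the network as undirected are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical rules as (TF1, TF2, score); the scores are made up for illustration
rules = [
    ("CTCF", "RAD21", 0.60),
    ("RAD21", "CTCF", 0.60),
    ("CTCF", "SMC3", 0.45),
    ("RAD21", "SMC3", 0.50),
    ("IKZF1", "IKZF2", 0.40),
]

def build_adjacency(rules):
    """Collapse TF1-TF2 rules into an undirected, weighted adjacency map."""
    adjacency = defaultdict(dict)
    for tf1, tf2, score in rules:
        # If a pair occurs in both orientations, keep the highest score
        best = max(score, adjacency[tf1].get(tf2, 0.0))
        adjacency[tf1][tf2] = best
        adjacency[tf2][tf1] = best
    return dict(adjacency)

network = build_adjacency(rules)

# Every undirected edge is stored twice, once per endpoint
n_edges = sum(len(neighbors) for neighbors in network.values()) // 2
print(sorted(network))
print(n_edges)
```

Each TF becomes a node and each rule becomes an edge carrying the rule's score, which is the kind of structure that a plotting function can then color and lay out.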
You can also change the colors of the nodes and edges via edge_cmap and node_cmap:
[4]:
C.plot_network(edge_cmap="Blues", node_cmap="viridis")
[4]:
By default, nodes are colored by their number of binding sites and edges are colored by the ‘cosine’ score. Both can be changed via the color_node_by and color_edge_by parameters:
[5]:
C.plot_network(color_node_by="TF1", color_edge_by="zscore")
[5]:
Different network layouts
It is possible to change the network layout via the “engine” parameter of plot_network:
[6]:
C.plot_network(engine="circo", legend_size=70)
[6]:
[7]:
C.plot_network(engine="fdp", size_node_by="TF1_count")
[7]:
[8]:
C.plot_network(engine="dot")
[8]:
[9]:
# The available layouts are:
import graphviz
graphviz.ENGINES
[9]:
{'circo', 'dot', 'fdp', 'neato', 'osage', 'patchwork', 'sfdp', 'twopi'}
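The engines listed above are Graphviz layout programs: the graph description stays the same and only the node placement algorithm changes. The sketch below shows this with a hand-written DOT description in plain Python. The to_dot helper and the three example edges are hypothetical and are not part of TF-COMB; the only Graphviz feature relied on is the graph-level layout attribute.

```python
def to_dot(edges, layout="dot"):
    """Return a Graphviz DOT description of an undirected graph."""
    lines = ["graph G {", f'    layout="{layout}";']
    for a, b in edges:
        lines.append(f'    "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical edges between three TFs
edges = [("CTCF", "RAD21"), ("CTCF", "SMC3"), ("RAD21", "SMC3")]

dot_circo = to_dot(edges, layout="circo")
dot_fdp = to_dot(edges, layout="fdp")

# The two descriptions differ only in the layout line
print(dot_circo)
```

Rendering either string requires a Graphviz installation, which is why the rendering step is left out here.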
Cluster network nodes
It is often of interest to partition the TFs into groups that are highly connected, i.e. that frequently co-occur. This is possible using the cluster_network() function. Below we show different settings for clustering TFs based on the co-occurrence network.
Using Louvain clustering (default)
By default, cluster_network uses Louvain clustering from the python-louvain package (https://github.com/taynaud/python-louvain):
[10]:
C.cluster_network()
INFO: Added 'cluster' attribute to the network attributes
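Louvain clustering searches for a partition with high modularity, a score that compares the number of edges inside each cluster with the number expected if edges were placed at random. The sketch below computes modularity for a toy graph in plain Python. It is not the python-louvain implementation: the modularity helper, the six-node graph and both partitions are made up for illustration.

```python
def modularity(edges, partition):
    """Modularity Q = sum over clusters of L_c/m - (d_c / 2m)^2.

    L_c is the number of edges inside cluster c, d_c is the summed degree
    of its nodes and m is the total number of edges (unweighted, undirected).
    """
    m = len(edges)
    intra = {}       # edges with both endpoints in the same cluster
    degree_sum = {}  # summed node degree per cluster
    for a, b in edges:
        ca, cb = partition[a], partition[b]
        degree_sum[ca] = degree_sum.get(ca, 0) + 1
        degree_sum[cb] = degree_sum.get(cb, 0) + 1
        if ca == cb:
            intra[ca] = intra.get(ca, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Toy graph: two triangles joined by a single bridge edge (C-D)
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"),
         ("C", "D")]

two_clusters = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2}
one_cluster = {node: 1 for node in "ABCDEF"}

q_two = modularity(edges, two_clusters)
q_one = modularity(edges, one_cluster)
print(q_two > q_one)  # True: splitting at the bridge scores higher
```

Splitting the toy graph at the bridge gives a modularity of 5/14, whereas placing all nodes in one cluster gives 0, so a modularity-based method prefers the two-cluster partition.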
The partition was added to the network attributes as well as to the internal .TF_table of the CombObj:
[11]:
C.TF_table
[11]:
 | TF1 | TF1_count | TF2 | TF2_count | cluster |
---|---|---|---|---|---|
CTCF | CTCF | 2432 | CTCF | 2432 | 1 |
RAD21 | RAD21 | 2241 | RAD21 | 2241 | 1 |
SMC3 | SMC3 | 1638 | SMC3 | 1638 | 1 |
IKZF1 | IKZF1 | 2922 | IKZF1 | 2922 | 2 |
IKZF2 | IKZF2 | 2324 | IKZF2 | 2324 | 2 |
... | ... | ... | ... | ... | ... |
ELK1 | ELK1 | 237 | ELK1 | 237 | 9 |
CEBPB | CEBPB | 273 | CEBPB | 273 | 6 |
RCOR1 | RCOR1 | 415 | RCOR1 | 415 | 6 |
TCF7 | TCF7 | 512 | TCF7 | 512 | 6 |
CBX5 | CBX5 | 887 | CBX5 | 887 | 6 |
86 rows × 5 columns
This enables plotting of the partition using plot_network:
[12]:
C.plot_network(color_node_by="cluster")
[12]:
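The cluster column of the TF_table can also be summarized directly, for example to list the members of each cluster. The sketch below uses only the ten rows visible in the table output above and plain Python; the rows list and the clusters dictionary are names introduced here for illustration.

```python
from collections import defaultdict

# (TF, cluster) pairs copied from the rows visible in the TF_table output
rows = [
    ("CTCF", 1), ("RAD21", 1), ("SMC3", 1),
    ("IKZF1", 2), ("IKZF2", 2),
    ("ELK1", 9),
    ("CEBPB", 6), ("RCOR1", 6), ("TCF7", 6), ("CBX5", 6),
]

# Collect the TF names belonging to each cluster
clusters = defaultdict(list)
for tf, cluster in rows:
    clusters[cluster].append(tf)

for cluster in sorted(clusters):
    print(cluster, clusters[cluster])
```

On the real TF_table, which is displayed like a pandas DataFrame, a groupby on the cluster column would presumably give the same grouping for all 86 TFs.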
Louvain clustering with weights
Louvain clustering can also take edge weights into account, shown here using the cosine score as weight:
[13]:
C.cluster_network(weight="cosine")
INFO: Added 'cluster' attribute to the network attributes
[14]:
C.plot_network(color_node_by="cluster")
[14]:
[15]:
# It is also possible to plot the network without labels by overriding the label via node_attributes:
C.plot_network(color_node_by="cluster", legend_size=0, node_attributes={"label": ""})
[15]:
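For intuition about the cosine weight used above, the sketch below computes a cosine score under the standard market basket definition: the co-occurrence count divided by the square root of the product of the two individual counts. Whether TF-COMB computes its ‘cosine’ column in exactly this way is an assumption here. The CTCF and RAD21 counts are taken from the TF_table above, while the pair count of 1400 is made up for illustration.

```python
import math

def cosine_score(pair_count, tf1_count, tf2_count):
    """Cosine score: pair count over the geometric mean of the two TF counts."""
    return pair_count / math.sqrt(tf1_count * tf2_count)

# TF counts from the TF_table; the pair count is a hypothetical value
ctcf_count = 2432
rad21_count = 2241
pair_count = 1400

score = cosine_score(pair_count, ctcf_count, rad21_count)
print(round(score, 2))

# Two TFs that always occur together reach the maximum score of 1
perfect = cosine_score(100, 100, 100)
```

Because the pair count cannot exceed either individual count, this score lies between 0 and 1, so pairs that co-occur at most of their sites contribute heavier edges to the weighted clustering.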
Using block model
The third option is to use the graph-tool function ‘minimize.minimize_blockmodel_dl’ (https://graph-tool.skewed.de/static/doc/inference.html#graph_tool.inference.minimize.minimize_blockmodel_dl):
[16]:
C.cluster_network(method="blockmodel")
[17]:
C.plot_network(color_node_by="cluster")
[17]: