Welcome to CSAR -- A Resource for Docking and
2013 Benchmark Exercise
2012 Datasets – Full Release
Computational chemists need reliable
experimental data. The
Community Structure-Activity Resource (CSAR) provides experimental datasets of
crystal structures and binding affinities for diverse protein-ligand complexes.
Some datasets will be generated in-house at Michigan while others will be
collected from the literature or deposited by academic labs, national centers,
and the pharmaceutical industry.
We aim to provide the highest quality data
for a diverse collection of proteins and small molecule ligands. We need input
from the community in developing our target priorities. Ideal targets will have
many high-quality crystal structures (apo and 10-20 bound to diverse ligands)
and affinity data for ≥25 compounds that range in size, scaffold, and logP.
It is best if the ligand set has several congeneric series that span a broad
range of affinity, with low nanomolar to mid-micromolar being most desirable.
We prefer Kd data over Ki data over IC50 data (no % activity data). We will
determine solubility, pKa, logP/logD data for the ligands whenever possible. We
have augmented some donated IC50 data by determining Kon/Koff and ITC data.
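
The target criteria above can be read as a simple screening rule. The following is a minimal sketch in Python, not a CSAR tool or data format: the `CandidateTarget` record, its field names, and the function names are hypothetical, and the lower bound of 10 bound structures is an assumption taken from the "10-20" range in the text. It covers only the countable criteria (apo structure, number of bound structures, number of compounds with affinity data) and the stated Kd over Ki over IC50 preference; ligand diversity and affinity range would need separate checks.

```python
from dataclasses import dataclass

# Preference order stated in the text: Kd over Ki over IC50.
# Anything else (e.g. % activity) is not accepted.
AFFINITY_PREFERENCE = {"Kd": 0, "Ki": 1, "IC50": 2}


@dataclass
class CandidateTarget:
    """Hypothetical summary of a proposed target (illustration only)."""
    name: str
    has_apo_structure: bool
    n_bound_structures: int
    n_compounds_with_affinity: int
    affinity_type: str


def meets_stated_criteria(target):
    """Check the countable criteria from the text.

    Assumption: "10-20 bound" is treated as "at least 10".
    """
    return (
        target.has_apo_structure
        and target.n_bound_structures >= 10
        and target.n_compounds_with_affinity >= 25
        and target.affinity_type in AFFINITY_PREFERENCE
    )


def rank_targets(targets):
    """Keep eligible targets and order them by preferred affinity type."""
    eligible = [t for t in targets if meets_stated_criteria(t)]
    return sorted(eligible, key=lambda t: AFFINITY_PREFERENCE[t.affinity_type])


if __name__ == "__main__":
    # Made-up candidates, used only to exercise the functions.
    candidates = [
        CandidateTarget("A", True, 12, 30, "IC50"),
        CandidateTarget("B", True, 15, 40, "Kd"),
        CandidateTarget("C", False, 20, 50, "Kd"),
        CandidateTarget("D", True, 12, 30, "% activity"),
    ]
    for t in rank_targets(candidates):
        print(t.name, t.affinity_type)
```

In this sketch the sort is stable, so targets with the same affinity type keep their input order.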
CSAR is funded by a U01 grant from the
National Institute of General Medical Sciences. The original RFA can be found
Press releases about CSAR can be found at:
Why should my company donate proprietary data to the public domain? Computational
techniques are very successful at enriching hit rates when identifying sets of
compounds for experimental testing. However, it is not possible to reliably
rank nanomolar-level compounds over those with micromolar affinities. First, by
donating data, a company outsources the development of better tools. Pharma has the
data, but not the time, to develop improved tools. Second, you have nothing to
lose because we are asking for “old” data. Abandoned projects have the kind of
data we need, and some could be donated without compromising a company’s competitive
advantage on current projects. Third, participation in CSAR can provide
visibility in the field. In particular, the donated data could be used to
conduct a community-wide blind evaluation of docking and scoring methods.
Lastly, there may be a financial benefit. Data has value, and it might
be possible for the company to declare a charitable donation (of course, this
requires consultation with the company’s legal and accounting teams). Our first
dataset has been contributed by Abbott (urokinase), and we have reached a legal
agreement with GSK to obtain data. We are working with scientists at BMS,
Vertex, Pfizer, Merck, Genentech, and Eli Lilly to identify possible
depositions. For the community to improve its approaches, we need
exceptional datasets to train scoring functions and develop new docking
algorithms. That is the goal of the CSAR project.