Development¶
Thanks for your interest in helping us develop the GA4GH reference implementation! There are lots of ways to contribute, and it’s easy to get up and running. This page should provide the basic information required to get started; if you encounter any difficulties please let us know
Warning
This guide is a work in progress, and is incomplete.
Development environment¶
We need a development Python 2.7 installation, Git, and some basic libraries. On Debian or Ubuntu, we can install these using
$ sudo apt-get install python-dev zlib1g-dev git
Note
TODO: Document this basic step for other platforms? We definitely want to tell people how to do this with Brew or ports on a Mac.
If you don’t have admin access to your machine, please contact your system administrator, and ask them to install the development version of Python 2.7 and the development headers for zlib.
Once these basic prerequisites are in place, we can then bootstrap our
local Python installation so that we have all of the packages we require
and we can keep them up to date. Because we use the functionality
of the recent versions of pip and other tools, it is important to
use our own version of it and not any older versions that may be
already on the system.
$ wget https://bootstrap.pypa.io/get-pip.py
$ python get-pip.py --user
This creates a user specific
site-packages installation for Python, which is based in your ~/.local
directory. This means that you can now install any Python packages you like
without needing to either bother your sysadmin or worry about breaking your
system Python installation. To use this, you need to add the newly installed
version of pip to your PATH. This can be done by adding something
like
export PATH=$HOME/.local/bin:$PATH
to your ~/.bashrc file. (This will be slightly different if you use
another shell like csh or zsh.)
We then need to activate this configuration by logging out, and logging back in. Then, test this by running:
$ pip --version
pip 6.0.8 from /home/username/.local/lib/python2.7/site-packages (python 2.7)
We are now ready to start developing!
GitHub workflow¶
First, go to https://github.com/ga4gh/server and click on the ‘Fork’ button in the top right-hand corner. This will allow you to create your own private fork of the server project where you can work. See the GitHub documentation for help on forking repositories. Once you have created your own fork on GitHub, you’ll need to clone a local copy of this repo. This might look something like:
$ git clone git@github.com:username/server.git
We can then install all of the packages that we need for developing the GA4GH reference server:
$ cd server
$ pip install -r requirements.txt --user
This will take a little time as the libraries that we require are fetched from PyPI and built.
It is also important to set up an upstream remote for your repo so that you can sync up with the changes that other people are making:
$ git remote add upstream https://github.com/ga4gh/server.git
TODO
- Walk a new developer through the process of fetching
upstream/masterand creating a new topic branch. - Walk them through pushing to their fork and making a pull request.
- Explain why we do a lot of rebasing, and walk them through the process of squashing commits.
Contributing¶
See the files CONTRIBUTING.md and STYLE.md for an overview of
the processes for contributing code and the style guidelines that we
use.
Development utilities¶
All of the command line interface utilities have local scripts
that simplify development: for example, we can run the local version of the
ga2sam program by using:
$ python ga2sam_dev.py
To run the server locally in development mode, we can use the server_dev.py
script, e.g.:
$ python server_dev.py
will run a server using the default configuration. This default configuration
expects a data hierarchy to exist in the ga4gh-example-data directory.
This default configuration can be changed by providing a (fully qualified)
path to a configuration file (see the Configuration
section for details).
Organisation¶
The code for the project is held in the ga4gh package, which corresponds to
the ga4gh directory in the project root. Within this package, the
functionality is split between the client, server, protocol and
cli modules. The cli module contains the definitions for the
ga4gh_client and ga4gh_server programs.