Project creation

First, create the project in the GitLab group that matches your project. Then edit the wiki to add your project to the list.

Project folders

Code

Functions must go in subfolders of the src folder. The launch script or notebook must sit at the root of the project.
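The layout rule above can be checked with a small script. This is only a sketch under a strict reading of the rule (no Python module directly inside src, and at least one .py or .ipynb file at the project root); the function name check_layout is hypothetical and not part of any existing tooling.

```python
from pathlib import Path
from typing import List


def check_layout(project_root: Path) -> List[str]:
    """Return a list of human-readable layout problems (empty if none)."""
    problems = []
    src = project_root / "src"
    if not src.is_dir():
        problems.append("missing src folder")
    else:
        # Strict reading: modules live in subfolders, not directly in src.
        for path in sorted(src.glob("*.py")):
            if path.name != "__init__.py":
                problems.append(f"{path.name} is directly in src, not in a subfolder")
    # A launch script (.py) or a notebook (.ipynb) is expected at the root.
    launchers = list(project_root.glob("*.py")) + list(project_root.glob("*.ipynb"))
    if not launchers:
        problems.append("no launch script or notebook at the project root")
    return problems
```

Calling check_layout(Path(".")) from the project root returns an empty list when the layout follows the rule.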

Data

No data should be placed inside the git folder. Data is shared between users and must be put inside the /home/data/ folder (on the host) or /data/ (inside a container). It must follow the same path as the code: /home/data/<group-gitlab>/<project-gitlab>

Most of the time, the processing follows these steps: raw data from client ==> some processing generating intermediate files ==> final files used by analysis ==> statistics files / powerpoints

This data folder structure reflects those steps:

.
├── inputs        # All inputs used
│   ├── interim     # intermediate files: can be deleted
│   ├── processed   # final files: used by analysis
│   └── raw         # original files sent by the client or downloaded
└── outputs       # All outputs sent to the client
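The folder structure above can be created with a short script. This is a minimal sketch, not an existing tool: the names project_data_dir and create_data_tree are hypothetical, and "my-group" / "my-project" are placeholders for the GitLab group and project.

```python
from pathlib import Path
from typing import List

# Sub-folders from the layout above, relative to the project's data folder.
DATA_SUBFOLDERS = (
    "inputs/raw",
    "inputs/interim",
    "inputs/processed",
    "outputs",
)


def project_data_dir(group: str, project: str, data_root: str = "/data") -> Path:
    """Return <data_root>/<group>/<project>, mirroring the GitLab path."""
    return Path(data_root) / group / project


def create_data_tree(base: Path) -> List[Path]:
    """Create the standard sub-folders under base and return them."""
    created = []
    for sub in DATA_SUBFOLDERS:
        folder = base / sub
        folder.mkdir(parents=True, exist_ok=True)  # safe to run more than once
        created.append(folder)
    return created


if __name__ == "__main__":
    # Placeholder names; use data_root="/home/data" on the host.
    base = project_data_dir("my-group", "my-project", data_root="/home/data")
    print(base)
```

The default data_root of /data matches the path inside a container; pass /home/data when running on the host. Calling create_data_tree(base) then creates the four folders.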

Add basic files

.gitignore

A file that tells git to ignore some system files. A template for Python is available here.

Makefile

A file that centralizes all the commands to launch.

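# Recipe lines must start with a tab character, not spaces.
.PHONY: clean flake8

# Remove build artifacts, caches and test/coverage reports.
clean:
	rm -rf dist/
	rm -rf build/
	rm -rf *.egg-info
	find . -name '*.pyc' -exec rm -f {} +
	find . -name '*.pyo' -exec rm -f {} +
	find . -name '*~' -exec rm -f {} +
	find . -name '__pycache__' -exec rm -fr {} +
	find . -name 'spark-warehouse' -exec rm -fr {} +
	rm -fr .tox/
	rm -f .coverage
	rm -fr htmlcov/
	rm -rf .pytest_cache/

# Lint the code after cleaning, with a 100-character line limit.
flake8: clean
	flake8 --max-line-length=100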

.gitlab-ci.yml

A file that defines the tasks launched by GitLab's integrated CI/CD.

Here is an example that runs a Python linter on each commit and deploys the code to the server on each commit to main:

stages:
  - test
  - deploy

default:
  image: alpine
  before_script:
    - apk add make
    - mkdir -p ~/.ssh && chmod 700 ~/.ssh
    - echo "$SSH_PMP1_PRODUCTION_KNOWN_HOST" >> ~/.ssh/known_hosts
    - chmod 644 ~/.ssh/known_hosts
    - "which ssh-agent || ( apk add --update openssh )"
    - eval $(ssh-agent -s)

flake8:
  image: cwaysdockerhub/vsc
  stage: test
  before_script:
    - make clean
  script:
    - make flake8

deploy_main_pmp1:
  stage: deploy
  script:
    - echo "$SSH_PMP1_PRODUCTION_PRIVATE_KEY" | tr -d '\r' | ssh-add - > /dev/null
    - ssh pmp-production@pmp1.pmplab.io "cd /home/pmp-production/<gitlab-group>/<gitlab-project>/
      && git fetch
      && git checkout main
      && git pull
      && exit"
  only:
    - main

README.md

The idea is to keep the README as small as possible, to minimize the work needed to keep it up to date. Here is a suggested set of sections to fill in:

# <Project name>

Short description: one sentence.

## Components

1. List of components

## Usage

Description of usage for each of the components (for the final user).
What are the commands to launch?

## Installation

If needed, steps to install and configure the application for the end user.

## Development

Here is the information developers need to improve / update the app.

### Code folder architecture

Result of the command tree -d -L <max_level> with a short description of each folder.

### Description of the steps to create a new thing, if needed
