09.12.2022

"Free" tricks to make good enough software

Sunniva Indrehus
Scientific developer


Do good enough[1]

Figure: A model of a horse. Credit: Ali Bati

Figure: Development speed vs. time. Credit: Radovan Bast

[1] Wilson, Greg, et al. 'Good enough practices in scientific computing.' PLoS computational biology 13.6 (2017): e1005510.

What is good enough code?

Some thoughts

  • Something that works
  • Standardized
  • Understandable
  • Reproducible[1]
  • Maintainable[2]

[1] Neural Information Processing Systems: The Machine Learning Reproducibility Checklist, retrieved 11.29, 25th of February 2022.
[2] Common estimate: roughly 20% of development time is spent on maintenance work each year.

How to write code that

Gives you ....

  • Something that works
  • Standardized
  • Understandable
  • Reproducible
  • Maintainable

What is a good enough code model?

Know the rules well, so you can break them effectively - Dalai Lama XIV

Python

  • High level programming language
  • Dynamically typed
    • Easy to use and misuse
  • Current version (7.12.22): 3.11.1

Figure credit: xkcd

Python 3.11.0 (main, Nov 16 2022, 11:26:09) [GCC 9.3.0] on linux
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Our PEP best friends

PEP 484

  • Type annotations (type hints)
    • Released with Python 3.5 on 2015-09-13
  • Tool

PEP 8

A Python Enhancement Proposal (PEP) is a design document providing information to the Python community, or describing a new feature for Python or its processes or environment

Example

$$n! = n \cdot (n-1) \cdot (n-2) \cdot \ldots \cdot 2 \cdot 1 = n \cdot (n-1)!$$

# this file is named factorial.py
def factorial(n):
    if n < 2:
        return 1
    return n * factorial(n - 1)

print(f"value is: {factorial(4)}")
print(f"value is: {factorial('4')}")
(pre-commits-python-example-py3.11):~/good-scientific-software-for-free-python-demo(main)$ python factorial.py
value is: 24
Traceback (most recent call last):
  File "/home/sunnivin/NGI/slides/good-scientific-software-for-free/good-scientific-software-for-free-python-demo/factorial.py", line 14, in <module>
    print(f"value is: {factorial('4')}")
                       ^^^^^^^^^^^^^^
  File "/home/sunnivin/NGI/slides/good-scientific-software-for-free/good-scientific-software-for-free-python-demo/factorial.py", line 8, in factorial
    if n < 2:
       ^^^^^
TypeError: '<' not supported between instances of 'str' and 'int'

Type hints

Update function to include type hints

# this file is named factorial.py
def factorial(n: int) -> int:
    if n < 2:
        return 1
    return n * factorial(n - 1)


print(f"value is: {factorial(4)}")
print(f"value is: {factorial(4.5)}")
(pre-commits-python-example-py3.11):~/good-scientific-software-for-free-python-demo(main)$ python factorial.py
value is: 24
value is: 39.375

Type hints

Was this pythonically correct?

(pre-commits-python-example-py3.11):~/good-scientific-software-for-free-python-demo(main)$ python factorial.py
value is: 24
😱 value is: 39.375 😱

Check with mypy

(pre-commits-python-example-py3.11):~/good-scientific-software-for-free-python-demo(main)$ mypy factorial.py
factorial.py:14: error: Argument 1 to "factorial" has incompatible type "float"; expected "int"  [arg-type]
Found 1 error in 1 file (checked 1 source file)
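
Python does not enforce type hints while the program runs, which is why `factorial(4.5)` executed without complaint. If the check also has to happen at runtime, one option is an explicit guard. The sketch below is a minimal illustration; `checked_factorial` is a hypothetical name and not part of the demo repository.

```python
def checked_factorial(n: int) -> int:
    """Return n! and reject non-integer input at runtime."""
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(n, bool) or not isinstance(n, int):
        raise TypeError(f"expected int, got {type(n).__name__}")
    if n < 2:
        return 1
    return n * checked_factorial(n - 1)


if __name__ == "__main__":
    print(f"value is: {checked_factorial(4)}")
    try:
        # mypy flags this call statically; the guard rejects it at runtime.
        checked_factorial(4.5)
    except TypeError as error:
        print(f"rejected: {error}")
```

The static check with mypy costs nothing at runtime, so a guard like this is mostly worthwhile at the boundary where untrusted input enters the program.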

❗ What about data[1]?

                     
[1] Data can be validated too, with Pydantic or Pandera; retrieved 15.11, 30th of November 2022 😍
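
Pydantic and Pandera validate data against a declared schema at runtime. To show the idea without a third-party dependency, the sketch below swaps in a standard-library `dataclass` that checks its fields in `__post_init__`. The `Measurement` class and its fields are invented for illustration, and this is not how either library is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Hypothetical record: a sample name and a measured depth in metres."""

    sample: str
    depth_m: float

    def __post_init__(self) -> None:
        # Type hints alone are not enforced, so the values are checked explicitly.
        if not isinstance(self.sample, str):
            raise TypeError("sample must be a str")
        if isinstance(self.depth_m, bool) or not isinstance(self.depth_m, (int, float)):
            raise TypeError("depth_m must be a number")
        if self.depth_m < 0:
            raise ValueError("depth_m must be non-negative")


if __name__ == "__main__":
    print(Measurement(sample="core-1", depth_m=12.5))
    try:
        Measurement(sample="core-2", depth_m="deep")
    except TypeError as error:
        print(f"rejected: {error}")
```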

Style guide

Indentation

Use 4 spaces per indentation level.

# Correct: aligned with opening delimiter.
foo = long_function_name(var_one, var_two,
                         var_three, var_four)

# Wrong: arguments on first line forbidden when not using vertical alignment.
foo = long_function_name(var_one, var_two,
    var_three, var_four)
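
PEP 8 also accepts a hanging indent, where nothing follows the opening parenthesis on the first line. The sketch below makes the snippet runnable by defining a placeholder `long_function_name` and placeholder values, since the style guide example leaves them undefined.

```python
def long_function_name(var_one, var_two, var_three, var_four):
    """Placeholder so the formatting example can run."""
    return var_one + var_two + var_three + var_four


var_one, var_two, var_three, var_four = 1, 2, 3, 4

# Aligned with the opening delimiter.
aligned = long_function_name(var_one, var_two,
                             var_three, var_four)

# Hanging indent: no arguments on the first line.
hanging = long_function_name(
    var_one, var_two,
    var_three, var_four)

print(aligned, hanging)
```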

Style guide

Names to avoid

Never use the characters ‘l’ (lowercase letter el), ‘O’ (uppercase letter oh), or ‘I’ (uppercase letter eye) as single character variable names.
In some fonts, these characters are indistinguishable from the numerals one and zero. When tempted to use ‘l’, use ‘L’ instead.

l: str = "my_string"  # lowercase el: easily confused with the numeral below
1
L: float = 4.5  # uppercase L is unambiguous

Style guide

The following code runs correctly, but it is not formatted according to PEP 8.

# this file is named factorial.py
from pathlib import Path
import math 

l = 4 
def factorial(
    n: int) -> int        :
    if n <          2:
        return 1
    return           n * factorial(n - 1)

print(f"value is: {factorial(l)}")

isort

(pre-commits-python-example-py3.11):~/good-scientific-software-for-free-python-demo(main)$ isort factorial.py
import math
from pathlib import Path

l = 4 
def factorial(
    n: int) -> int        :
    if n <          2:
        return 1
    return           n * factorial(n - 1)

print(f"value is: {factorial(l)}")

Flake8

(pre-commits-python-example-py3.11):~/good-scientific-software-for-free-python-demo(main)$ flake8 factorial.py
factorial.py:1:1: F401 'math' imported but unused
factorial.py:2:1: F401 'pathlib.Path' imported but unused
factorial.py:4:1: E741 ambiguous variable name 'l'

After removing the unused imports and renaming the variable:

# this file is named factorial.py
my_variable = 4 
def factorial(
    n: int) -> int        :
    if n <          2:
        return 1
    return           n * factorial(n - 1)

print(f"value is: {factorial(my_variable)}")

Black

(pre-commits-python-example-py3.11):~/good-scientific-software-for-free-python-demo(main)$ black factorial.py
reformatted factorial.py

All done! ✨ 🍰 ✨
1 file reformatted.
# this file is named factorial.py
my_variable = 4


def factorial(n: int) -> int:
    if n < 2:
        return 1
    return n * factorial(n - 1)


print(f"value is: {factorial(my_variable)}")
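
Reformatting should change layout only, never behaviour. One way to convince yourself is to compare the syntax trees of the code before and after formatting; Black performs a comparable equivalence check by default. The sketch below uses only the standard-library `ast` module, and `same_ast` is a hypothetical helper name. The two strings are the pre-Black and post-Black versions of `factorial.py` from this slide deck.

```python
import ast

BEFORE = '''\
my_variable = 4
def factorial(
    n: int) -> int        :
    if n <          2:
        return 1
    return           n * factorial(n - 1)

print(f"value is: {factorial(my_variable)}")
'''

AFTER = '''\
my_variable = 4


def factorial(n: int) -> int:
    if n < 2:
        return 1
    return n * factorial(n - 1)


print(f"value is: {factorial(my_variable)}")
'''


def same_ast(first: str, second: str) -> bool:
    """Return True if both sources parse to the same syntax tree."""
    # ast.dump leaves out line and column numbers by default, so layout is ignored.
    return ast.dump(ast.parse(first)) == ast.dump(ast.parse(second))


print(same_ast(BEFORE, AFTER))
```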

Git-Hooks

  • Enforce standards locally

  • Action linked to a change

  • Pre-commit
    • .pre-commit-config.yaml
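
Git runs the executable `.git/hooks/pre-commit` before each commit and aborts the commit if it exits with a non-zero status. The sketch below shows that mechanism in plain Python. `run_checks` and `main` are hypothetical helpers, the listed commands assume the tools are installed, and the pre-commit framework does considerably more than this (for example, managing tool environments and passing only the staged files).

```python
import subprocess
import sys


def run_checks(commands: list[list[str]]) -> int:
    """Run each command and return 1 if any of them fails, else 0."""
    exit_code = 0
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"check failed: {' '.join(command)}", file=sys.stderr)
            exit_code = 1
    return exit_code


def main() -> int:
    # Assumed tool invocations; adjust to the checks your project uses.
    checks = [
        ["isort", "--check-only", "factorial.py"],
        ["black", "--check", "factorial.py"],
        ["flake8", "factorial.py"],
    ]
    return run_checks(checks)
```

Saved as an executable hook script that ends with `sys.exit(main())`, a failing check would stop the commit, which is the "action linked to a change" named above.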

Pre-commit

(pre-commits-python-example-py3.11):~/good-scientific-software-for-free-python-demo(main)$ ls -la .git/hooks/
... 
-rwxr-xr-x 1 sunnivin sunnivin 1638 Dec  8 13:47 pre-commit.sample
... 
(pre-commits-python-example-py3.11):~/good-scientific-software-for-free-python-demo(main)$ poetry install 
(pre-commits-python-example-py3.11):~/good-scientific-software-for-free-python-demo(main)$ pre-commit install
pre-commit installed at .git/hooks/pre-commit
(pre-commits-python-example-py3.11):~/good-scientific-software-for-free-python-demo(main)$ ls -la .git/hooks/
...
-rwxr-xr-x 1 sunnivin sunnivin  678 Dec  8 13:59 pre-commit
-rwxr-xr-x 1 sunnivin sunnivin 1638 Dec  8 13:47 pre-commit.sample
...

Pre-commit

(pre-commits-python-example-py3.11):~/good-scientific-software-for-free-python-demo(main)$ git add factorial.py 
(pre-commits-python-example-py3.11):~/good-scientific-software-for-free-python-demo(main)$ git commit -m "doc: enforcing PEP8" 
...
Fixing factorial.py
Mixed line ending........................................................Passed
mypy.....................................................................Passed
seed isort known_third_party.............................................Passed
isort....................................................................Passed
black....................................................................Failed
- hook id: black
- files were modified by this hook
reformatted factorial.py
All done! ✨ 🍰 ✨
1 file reformatted.
pyupgrade................................................................Passed
blacken-docs.............................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1
factorial.py:1:1: F401 'math' imported but unused
factorial.py:2:1: F401 'pathlib.Path' imported but unused
factorial.py:4:1: E741 ambiguous variable name 'l'
...

Pre-commit

My code is updated

# this file is called factorial.py
l = 4


def factorial(n: int) -> int:
    if n < 2:
        return 1
    return n * factorial(n - 1)


print(f"value is: {factorial(l)}")

Pre-commit with automatic formatting

Demo repository

🙏 Credit for this talk 🙏

Write some invisible text

## <span style="color:#F5F5F5">What is a good *enough* code model?</span>

A model of a horse ![w:400 h:350](figures/illustrations/horse.png) *Credit:* [Ali Bati](http://www.alibati.com/horse)

Development speed vs. time ![w:450 h:325](figures/illustrations/development_speed_quick_hacks.png) *Credit:* Radovan Bast

<!-- paginate: true

_footer: "![width:90 height:40](figures/logo/NGI/NGI_logo_transparent.gif) *Figure credit: [Ali Bati](http://www.alibati.com/horse)* " # What is a good *enough* scientific model? - Use something simplified to learn about the real world ![bg right w:400 h:350](figures/illustrations/horse.png) ---

# What is a good *enough* code? - Use something simplified **to write a simplified code** to learn something about the real world ![bg right w:400 h:350](figures/illustrations/horse.png) ---

## "free" quality code - Use type hints - (Try) to follow the Python Enhancement Proposals (PEPs)

<!-- _class: split-text-image

# The PEPs <div class=ldiv> A Python Enhancement Proposal [(PEP)](https://peps.python.org/pep-0000/) is a design document for Python code ## Why should **you** care about the PEPs? <span style="color:#F5F5F5"> - *Code is read more often than it is written* - Guido van Rossum - Standardization - Automation </span> </div> <div class=rdiv>

![w:400 h:425](figures/illustrations/dependency.png) *Image credit: [xkcd](https://xkcd.com/2347/)*

</div> ---

# The PEPs <div class=ldiv> A Python Enhancement Proposal [(PEP)](https://peps.python.org/pep-0000/) is a design document for Python code ## Why should **you** care about the PEPs? - *Code is read more often than it is written* - Guido van Rossum - Standardization - Automation </div> <div class=rdiv> ![w:400 h:425](figures/illustrations/dependency.png) *Image credit: [xkcd](https://xkcd.com/2347/)* </div> ---

<!-- _class: split-text-image

# The *zen* of scientific programming <div class=ldiv> ## ## ## - Find the right tool for the right job </div> <div class=rdiv> ## ![w:450 h:325](figures/illustrations/development_speed_quick_hacks.png) *Credit: Radovan Bast* </div>