Python Starter – Course

Yoan Mollard

CHAPTER 1

  1. CHARACTERISTICS AND SYNTAX OF PYTHON
    1.1. Characteristics
    1.2. Python types and typing
    1.3. Control structures
    1.4. Exceptions
    1.5. Import and use installed libraries

CHAPTER 2

  2. MODULES, PACKAGES AND LIBRARIES
    2.1. Structure of Python packages
    2.2. The Python Package Index (PyPI.org)
    2.3. Virtual environments (venv)
    2.4. Testing
    2.5. Object-Oriented Programming (O.O.P.)

CHAPTER 3

  3. GO FURTHER: THE PYTHON ECOSYSTEM
    3.1. Popular built-in libraries (coming with all Python distribs)
    3.2. Popular non-builtin libraries
    3.3. Other non-builtin libraries for code quality
    3.4. The Python Enhancement Proposals (PEP) and the PEP 8
    3.5. Python versions
  2. Charset and encoding

List of mini-projects

  1. The basics in JupyterLab: practice lists and dictionaries
  2. Hangman: practice Python scripting, in PyCharm
  3. Money transfer simulator: develop, test and distribute a complete package
  4. Address book: manipulate nested data collections, and argparse, re, json
  5. Choose a mini-project: (ascending difficulty)
    5A. Guess my number: practice basic Python syntax
    5B. Estimate π: practice more complex computations, and random
    5C. Create a micro web app: practice flask
    5D. Communicate with a REST API: practice requests
    5E. Webscraping: practice beautifulsoup4
    5F. Plot ping durations: practice subprocess and matplotlib

CHAPTER 1

CHARACTERISTICS AND SYNTAX OF PYTHON

Python is an interpreted, multi-paradigm programming language first released in 1991*, used for:

  • Data science:
    • Data analysis (big data), dataviz, extrapolation, trends…
    • Machine Learning: Torch, Tensorflow, Theano …
  • Web applications: backends and servers
  • System administration, scripting, automation
  • Business applications: fat clients for the information system

* It's older than Java!

Pros

  • Concise and readable language, with nothing superfluous
  • Light syntax to focus on the content, not the form
  • Fast prototyping and development
  • Automatic memory management
  • A lot of libraries available in the PyPI index (store) + alternatives
  • Community-centric: Documented on docs.python.org + other community resources
  • Multiplatform: Windows, UNIX, BSD, MacOS … even microcontrollers with 16k of RAM!
  • Interoperable: Bindings to C++/Java …

Cons

  • Slower than compiled languages, especially for CPU-bound code
  • Only server-side, not suitable for frontends or mobiles
  • Dynamic typing of variables (duck typing):
    • Favors a high memory usage
    • Favors production errors (runtime exceptions):
      • e.g. add an integer with a string
      • e.g. use an undeclared variable

It is easy to make robust Python applications with good practices: e.g. explicit typing and testing.

Characteristics

  • Interpreted language

  • Multiparadigm: object, imperative and functional

  • Relies on Duck typing 🦆

  • Indentation is part of the syntax

  • Naming conventions:

    • snake_case() for variable names, function names, and file names
    • CamelCase for class names
  • Parameters are passed by object reference: mutable objects can be modified by the callee, immutable ones (int, float, bool, str, tuple...) cannot

The Python interpreter

Python files use extension .py, e.g.:

# my_program.py 
print("Hello world")

You pass your Python code to a Python interpreter installed on your system, e.g. C:\Python\python.exe or /usr/bin/python ...

/usr/bin/python my_program.py

The interpreter:

  • Compiles it in bytecode on-the-fly
  • Runs the bytecode

The interpreter caches the bytecode in .pyc files inside __pycache__ folders (.pyo files were also produced before Python 3.5).

As a developer, you can ignore them: the interpreter handles compilation by itself.

Several implementations of the interpreter exist: CPython (the most popular implementation, written in C), Jython (in Java), PyPy (in Python).

The REPL

The REPL (Read-Eval-Print Loop), aka Python console or interactive Python:

  • Read the user input: a Python instruction
  • Evaluate it
  • Print the result, if any
  • Loop back to the beginning and read again

user@computer:~ $ python
Python 3.12.3 (main, Jul 31 2024, 17:43:48) [GCC 13.2.0] on linux

Type "help", "copyright", "credits" or "license" for more information.

>>> print("There are", 5 + 5, "apples in the basket")
There are 10 apples in the basket

>>> i = 6 + 6

Programming paradigms

Python is multi-paradigm:

  • Imperative: instructions create state changes
  • Object-oriented: instructions are grouped with their data in objects/classes
  • Functional: instructions are math function evaluations

All 3 paradigms are popular in the Python community, and often mixed all together.

Python types and typing

Python typing is dynamic. Type is inferred from the value ➡️ Runtime type

Runtime type of v can be introspected with type(v)

But pythonistas rely on 🦆 duck typing: to judge whether an object obj is suitable, its runtime type type(obj) matters less than the methods it provides.

Example: as soon as class C defines the method __iter__, instances of C are considered iterable, no matter what type() returns for them.
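
As an illustration of the example above, here is a minimal sketch. The class Countdown is hypothetical (it is not part of the course material): it inherits from no collection type, yet the for loop and list() accept it because it defines __iter__.

```python
class Countdown:
    """Hypothetical class: no special parent class, but it defines __iter__."""

    def __init__(self, start: int):
        self.start = start

    def __iter__(self):
        # Delegate to the iterator of a range counting down to 1
        return iter(range(self.start, 0, -1))


countdown = Countdown(3)
print(type(countdown).__name__)   # Countdown: neither a list nor a tuple

for value in countdown:           # The for loop only needs __iter__
    print(value)

values = list(countdown)          # [3, 2, 1]
```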

Primitive types

i = 9999999999999999999999999                   # int (unbounded)
f = 1.0                                         # float
b = True                                        # bool
n = None                                        # NoneType (NULL)

🚨 Beware with floats

Python floats are IEEE 754 double-precision floats: most decimal fractions cannot be represented exactly, so results carry small rounding errors:

0.1 + 0.1 + 0.1 - 0.3 == 0    # This is False 😿
print(0.1 + 0.1 + 0.1 - 0.3)  # Prints 5.551115123125783e-17, not 0

Also, they cannot handle large differences of magnitude between operands:

1e-10 + 1e10 == 1e10          # This is True 😿

When you deal with decimal numbers and precision matters, use the decimal module!

from decimal import Decimal
Decimal("1e-10") + Decimal("1e10") == Decimal("1e10")   # This is False 🎉

Beware not to initialize Decimal with a float, since the precision is already lost: Decimal(0.1) evaluates to Decimal('0.1000000000000000055511151231257827021181583404541015625'). Initialize it with a string instead: Decimal("0.1").

The tuple

The tuple is the Python type for an ordered sequence of elements (an array).

t = (42, -15, None, 5.0)
t2 = True, True, 42.5
t3 = (1, (2, 3), 4, (4, 5))
t4 = 1,

Selection of an element uses the [ ] operator with an integer index starting at 0:

element = t[0]  # Returns the 0th element from t 

Tuples can be unpacked:

a, b = b, a   # Value swapping that unpacks tuple (b, a) into a and b

The string

The str type is an ordered sequence of characters.

s = "A string"
s2 = 'A string'           # Simple or double quotes make no difference
s3 = s + ' ' + s2         # Concatenation builds and returns a new string
letter = s2[0]            # Element access with an integer index

Tuples and strings are immutable. Definition: an object is said to be immutable when its value cannot be updated after the initial assignment. The opposite is mutable.

Demonstration: put the first letter of these sequences in lower case:

s = "This does not work"
s[0] = "t"
# TypeError: 'str' object does not support item assignment

The list

A list is a mutable sequence of objects using integer indexes:

l = ["List example", 42, ["another", "list"], True, ("a", "tuple")]

element = l[0]             # Access item at index 0
l[0] = "Another example"   # Item assignment works because the list is mutable

some_slice = l[1:3]  # Return a sliced copy of l between indexes 1 (inc.) & 3 (ex.)

42 in l    # Evaluates to True if integer 42 is present in l

l.append("element")        # Append at the end (right side)
element = l.pop()          # Remove from the end and return the removed item

If needed, pop(i) and insert(i, value) operate at index i, but...

... ⚠️ list is fast to operate only at the right side!

Need a collection that is efficient at both ends? Use collections.deque.
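
A short sketch of deque from the standard collections module, which offers the same operations on both sides. The variable names are chosen for illustration only.

```python
from collections import deque

queue = deque(["b", "c"])
queue.appendleft("a")     # Fast insertion on the left side
queue.append("d")         # Fast insertion on the right side
first = queue.popleft()   # Fast removal from the left side: "a"
last = queue.pop()        # Fast removal from the right side: "d"

print(first, last, list(queue))   # a d ['b', 'c']
```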

The dictionary

The dictionary is a key-value pair container, mutable and ordered. Keys are unique.

d = {"key1": "value1", "key2": 42, 1: True} 
# Any hashable type is accepted as a key and any type as a value, even mixed together

"key1" in d   # Evaluates to True if "key1" is a key in d
# Operator "in" always and only operates on keys

d["key2"]    # Access the value associated to a key

d.keys()     # dict_keys(["key1", "key2", 1])

d.values()   # dict_values(["value1", 42, True])

d["key3"] = "some other value"   # Insertion of a new pair

d.update({"key4": "foo", "key1": "bar"})

Before Python 3.7, the order of dictionaries was not guaranteed (see OrderedDict if needed). Since Python 3.7, they preserve insertion order.

Type hints

i: int = 42
l: list = ["a", "b", "c"]
l: list[str] = ["a", "b", "c"]
d: dict[str, dict[str, list[str]]] = {"emails": {"Josh": ["josh@example.org", "contact@josh.me"]}}

Type hints are only intended for type checkers (e.g. mypy, or PyCharm).
At runtime, they are ignored by the interpreter, but your type checker or IDE will report the mismatch between the declared type and the literal:

i: int = 4.2

The typing module contains tools for advanced typing mechanisms:

from typing import Optional, Union

number: Union[float, int, complex] = 4.2
value: Optional[list] = None

Parameter passing

What happens during an assignment?

list1 = [10, 20, 30]

list2 = list1     # Assignment by reference

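Python passes parameters by object reference (the function receives the same object, not a copy):

  • immutable objects (int, float, bool, str, tuple...) cannot be modified in place, so they behave as if passed by value
  • mutable objects (list, dict, set...) can be modified by the function, and the caller sees the change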

When you need a copy of the original:

list2 = list1.copy()
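
The following sketch shows the practical consequences for assignments and function calls. The helper functions append_item and rebind are hypothetical names introduced for this illustration.

```python
def append_item(items: list) -> None:
    items.append(40)        # Mutates the object shared with the caller


def rebind(number: int) -> None:
    number = number + 1     # Rebinds the local name only


list1 = [10, 20, 30]
list2 = list1               # Both names designate the same list object
list3 = list1.copy()        # Independent (shallow) copy

append_item(list1)
print(list2)                # [10, 20, 30, 40]: list2 sees the change
print(list3)                # [10, 20, 30]: the copy is untouched
print(list1 is list2, list1 is list3)   # True False

n = 5
rebind(n)
print(n)                    # 5: the caller's integer is unchanged
```

Note that copy() is shallow: nested lists inside the copy are still shared with the original.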

Control structures

The for loop

for i in range(10):
    print("element number", i)

Equivalent to the following loop in C:

for(int i=0; i<10; ++i)

The general form is:

for i in range(start_index, end_index_excluded, step):
    body

Equivalent to the following loop in C:

for(int i=start_index; i<end_index_excluded; i+=step) 

When the loop variable is not needed, the underscore is used as its name so that linters do not warn about a variable being declared but unused:

for _ in range(10):
    _, result, _ = function()

The for loop can iterate over any iterable, e.g. a list:

for item in ["starter", "dish", "dessert"]:
    print("Please eat your", item)

When both the index and the value are needed, use enumerate():

for index, item in enumerate(["starter", "dish", "dessert"]):
    print(item, "is at index", index)

# starter is at index 0
# dish is at index 1
# dessert is at index 2

enumerate() yields (index, value) tuples, which are unpacked within the for loop. Converted to a list, they read:

[(0, "starter"), (1, "dish"), (2, "dessert")]

Iterating over a dictionary iterates on keys:

menu = {"starter": "salad", "dish": "chicken", "dessert": "brownie"}
for key in menu: 
    print("Your", key, "is a", menu[key])

# Your starter is a salad
# Your dish is a chicken
# Your dessert is a brownie

When both the key and the value are needed, use items():

menu = {"starter": "salad", "dish": "chicken", "dessert": "brownie"}
for key, value in menu.items(): 
    print("Your", key, "is a", value)

The while loop

i = 100

while i > 0:
    print(i)
    i -= 1

There is no do ... while structure.

There is no ++ or -- operator for increments.

Definition: Python calls iterable any object that can be iterated over with a for loop. Duck typing applies: an object is iterable as soon as it provides __iter__.

Use logical operators and, or, not in order to build logical expressions:


# We are looking for the first odd integer whose square is > 1000
# Loop while i is even or its square is not > 1000

i = 0

while i%2 == 0 or not i*i > 1000:
    i += 1 

print("The searched integer is", i)   # The answer is 33

The break statement

The break statement exits the for or while loop:

# The first odd integer whose square is > 1000
i = 0

while True:      # Infinite loop
   if i%2 == 1 and i*i > 1000:
      break      # immediately quits the loop
   i += 1

print(i, "is the searched number")

The continue statement

The continue statement skips the rest of the current iteration and resumes with the next value:

integers = [-5, 10, 42, -9, 54, 1, -1, 55, -8, -84, 12]

# We want to call process(i) only on positive integers from the above list
for i in integers:
    if i < 0:
        continue  # skips i since it's negative and jumps to the next i

    print(i, "is positive and must be processed")
    process(i)

The if statement

if mark < 8:
    print("You failed the exam")
elif mark < 10:         # Conditions are evaluated from top to bottom
    print("You must retake the exam")
else:
    print("You passed the exam")

When needed, use logical operators and, or, not, and/or nested blocks, e.g.:

if some_statement and some_other_statement:
    print("A")
    if yet_another_statement:
        print("B")
    else:
        print("C")
else:
    print("D")

Ternary expressions

want_float = True
val = 42.0 if want_float else 42      # Equivalent to the ? ... : in C

The match ... case statement

value = 10
match value:                 # Equivalent to the `switch() case:` in C
    case 10:
        print("TEN")
    case 20:
        print("TWENTY")
    case _:
        print("OTHER VALUE")

Only in Python 3.10 and above.

Function definition

def my_custom_sum(a: int, b: int) -> int:
    """
    Computes the sum of 2 integers
    :param a: the first element to sum
    :param b: the second element to sum
    :return: the sum of a and b
    """
    return a + b

💡 Good practice: Add docstrings to your functions by using """.

After typing """, your IDE may autocomplete the docstring with a skeleton.

The expected format of the docstring is reStructuredText but other formats exist.

Docstrings can also be used to document a variable, a class, a whole file...

One can return 2 values or more:

def compute(a, b) -> tuple:
    return a + b, a - b, a * b, a / b

result = compute(4, 6)

Call to compute() returns a tuple:

results: tuple = compute(4,6)
quotient = results[3]

This tuple can also be unpacked:

total, difference, product, quotient = compute(4, 6)   # "total" avoids shadowing the built-in sum()

The star * is the flag that means 0 or n values. They are received in a tuple:

def compute(*args):
    # "total" rather than "sum", so that the built-in sum() is not shadowed
    total, difference, product, quotient = 0, 0, 1, 1
    for value in args:   # args is a tuple
        total += value
        difference -= value
        product *= value
        quotient /= value
    return total, difference, product, quotient

total, *other_results = compute(42, 50, 26, 10, 15)

A named parameter is passed to a function via its name instead of its position:

def sentence(apples=1, oranges=10):
   return f"He robbed {apples} apples and {oranges} oranges"

p = sentence(2, 5)
p = sentence()
p = sentence(oranges=2) 

The double star ** is the flag that means 0 or n named parameters. They are received as a dictionary:

def sentence(**kwargs):
    for item, quantity in kwargs.items():  # kwargs is a dict
        print(f"He robbed {quantity} {item}")

sentence(apples=2, oranges=5)
# He robbed 2 apples
# He robbed 5 oranges

Built-in functions

  • len(): Get the length (number of elements) of a container
  • min(), max(): Get the minimum or maximum values in a container
  • sum(): Get the sum of all elements in the container
  • round(): Round a float to a specific number of digits
  • zip(): Couple elements from several lists 2-by-2 (like a coat zipper)
  • int(), float(), list(), dict()...: Convert the parameter (cast)
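
A quick tour of the built-in functions listed above, as a sketch on made-up data (the lists prices and names are illustrative only):

```python
prices = [4.5, 12.0, 7.25]
names = ["tea", "cake", "juice"]

print(len(prices))                 # 3
print(min(prices), max(prices))    # 4.5 12.0
print(sum(prices))                 # 23.75
print(round(3.14159, 2))           # 3.14

pairs = list(zip(names, prices))   # [('tea', 4.5), ('cake', 12.0), ('juice', 7.25)]
menu = dict(pairs)                 # {'tea': 4.5, 'cake': 12.0, 'juice': 7.25}

print(int("42") + 1)               # 43
print(float("2.5") * 2)            # 5.0
```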

Comprehensions

A comprehension is an inline notation to build a new collection (list, dict, set).
Here is a list-comprehension:

l = [i*i for i in range(10)]  # Select i*i for each i in the original "range" sequence
# Returns [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

You may optionally filter the selected values with an if statement:

l = [i*i for i in range(100) if i*i % 10 == 0]  # Select values that are multiple of 10
# Returns [0, 100, 400, 900, 1600, 2500, 3600, 4900, 6400, 8100]

l = [(t, 2*t, 3*t) for t in range(5)] # Here we select tuples of integers:
# Returns [(0, 0, 0), (1, 2, 3), (2, 4, 6), (3, 6, 9), (4, 8, 12)]

Dict-comprehensions also work:

d = {x: x*x for x in range(10)}
# Returns {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}

Exceptions

An exception is an error.

The exception mechanism allows errors to be triggered, propagated and handled in a structured way.

An exception can be:

  • raised with the raise keyword: it is triggered
  • caught with the except keyword: it is intercepted and a fix is provided

When it is caught, a fix (workaround) is executed:

For instance, when value = a / b fails because b = 0, you might want to continue the execution with value = 0.
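
The division example can be sketched as follows. The function name safe_divide is hypothetical. Dividing by zero raises the built-in ZeroDivisionError, which is caught here to substitute the value 0.

```python
def safe_divide(a: float, b: float) -> float:
    """Hypothetical helper: return a / b, or 0 when b is zero."""
    try:
        return a / b
    except ZeroDivisionError:
        # Workaround: carry on with a default value
        return 0


print(safe_divide(10, 2))   # 5.0
print(safe_divide(10, 0))   # 0
```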

Propagation of exceptions

At runtime, if an exception is raised but not caught, it is automatically propagated to the calling function.

If it is not caught there either, it goes up again ... and again ...

If it reaches the top level without being caught, the interpreter prints the traceback and exits.

Propagation is one of the main benefits of exceptions: it helps to find the right balance between:

  • no error management
  • every function call individually checked for errors

Compared to custom error handling (e.g. return codes), exceptions have the following benefits:

  • they propagate automatically, which allows general workarounds to cover large parts of the code
  • they are typed, and all exception types are organized in a hierarchy
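
A minimal sketch of propagation through a call chain. The three functions are hypothetical: the error is raised at the bottom, ignored by the intermediate function, and handled once at the top.

```python
def read_ratio(a: float, b: float) -> float:
    return a / b                 # May raise ZeroDivisionError; not caught here


def compute_report(a: float, b: float) -> str:
    ratio = read_ratio(a, b)     # Not caught here either: it keeps going up
    return f"ratio = {ratio}"


def main() -> str:
    try:
        return compute_report(1, 0)
    except ZeroDivisionError:    # One general workaround for the whole call chain
        return "ratio unavailable"


print(main())   # ratio unavailable
```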

Common exception types

  • ValueError: value error (e.g. square root of a negative number)
  • TypeError: type error (e.g. adding an int with a str)
  • IndexError: access to an index exceeding the list size
  • KeyError: access to a dictionary key that does not exist
  • NameError: access to an undeclared variable name or function name
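  • IOError (alias of OSError since Python 3.3): I/O error (e.g. corrupted data, unexpected end of file...)
  • FileNotFoundError: file not found (a subclass of OSError)
  • RuntimeError: error that does not fall into any other category
  • SyntaxError: bad syntax (indentation, unexpected keywords, ...)
  • KeyboardInterrupt: received SIGINT signal (Ctrl+C)

All of the above derive from Exception, except KeyboardInterrupt, which derives directly from BaseException.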

The try/except block

The basic syntax to catch an exception:

try:
    protected_code()  # Raises IOError
except IOError:
    substitution_code()

What happens at runtime:

try:
    protected_code()  # Raises IOError
    skipped_code()
    skipped_code2()
except IOError:
    substituted_code()
    substituted_code2()
resumed_code()
resumed_code2()

Other uses of exceptions

try: # Different exception types associated to the same substitution block
    protected_code()
except (IOError, ValueError):      # Group the exception types in a tuple
    substitution_code()

try:         # Different substitution blocks for different exception types
    protected_code()
except FileNotFoundError:          # Most specific type first: it is a subclass of IOError
    substitution_code1()
except IOError:
    substitution_code2()

if some_positive_value < 0:     # Trigger an exception by yourself
    raise ValueError("Negative values are not authorized")

Import and use installed libraries

Use import or from...import statements to import and use installed libraries:

import math
value = math.sqrt(25)

One can define an alias to a module:

import math as m
value = m.sqrt(25)

With import, resources are accessed by prefixing them with the name of the module (or its alias).

from math import sqrt
value = sqrt(25)

Resources loaded with from...import are bound directly in the current namespace and require no prefix.

CHAPTER 2

MODULES, PACKAGES AND LIBRARIES

Difference between modules and packages

A module is a Python file, e.g. some/folder/mymodule.py. The module name uses the dotted notation to mirror the file hierarchy: some.folder.mymodule

A module is made to be either:

  • executed from a shell, in which case it is a script: python some/folder/mymodule.py
  • imported from another module: import mymodule (its folder needs to be in sys.path, see The Python path below)

A Python package is a folder containing modules and optional sub-packages: some is a package, folder is a sub-package.

Scripts: the shebang

On UNIX-like systems, a shebang is the first line of a script: it tells the system which interpreter must be called to execute this Python module.

Invoke the env command to fetch the suitable interpreter for python3 with:

#!/usr/bin/env python3

Direct call to the interpreter is possible but NOT recommended, since it will force the interpreter by ignoring any virtual environment you could be in:

#!/usr/local/bin/python3

ℹ️ The Windows shell ignores shebangs, but you should provide them anyway.

Structure of Python packages

  • Packages and sub-packages bring a hierarchy to your code
  • The package's hierarchy is inherited from the files-and-folders hierarchy
  • Modules hold resources that can be imported later on, e.g.:
    • Constants
    • Classes
    • Functions...
  • All packages and sub-packages should contain an __init__.py file each (without it, the folder is only treated as a namespace package)
  • In general __init__.py is empty but may contain code to be executed at import time

Then the package or subpackages can be imported:

import my_math.trigo.sin       # Import the module explicitly so that it is loaded
my_math.trigo.sin.sinus(0)

import my_math.trigo.sin as my_sin
my_sin.sinus(0)

Specific resources can also be imported:

from my_math.matrix.complex.arithmetic import product
sixteen = product(4, 4)

Relative imports (Imports internal to a package)

Relative import from the same folder:

from .my_math import my_sqrt
value = my_sqrt(25)

Relative import from a parent folder:

from ..my_math import my_sqrt
value = my_sqrt(25)

  • Do not put any slash such as import ../my_math
  • Relative imports can fetch . (same dir), .. (parent), ... (parent of parent)
  • Relative imports are forbidden when run from a module outside a package
  • Using absolute imports instead of relatives could result in name collisions

The Python path

The interpreter resolves absolute imports by searching the folders listed in the Python path, sys.path.

This is a regular Python list and it can be modified at runtime (with append) to add paths to your libs.
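
A minimal sketch of such a runtime modification. The folder /home/user/my_libs is a made-up path used for illustration only.

```python
import sys

# Hypothetical folder holding your own libraries
extra_path = "/home/user/my_libs"

if extra_path not in sys.path:
    sys.path.append(extra_path)   # Appended entries are searched last

print(extra_path in sys.path)     # True
```

Installing your code as a package in a virtual environment is usually more robust than editing sys.path by hand, so treat this as a quick workaround.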

The Python Package Index (PyPI.org)

pypi.org is a global server for finding, installing and sharing Python projects.

pypi.org is operated by the Python Packaging Authority (PyPA): a working group from the Python Software Foundation (PSF).

The command-line tool Package Installer for Python (pip) can be used to install packages by their name, e.g. bottle. It can install from various sources (link to a code repository, ZIP file, local server...) and searches PyPI if no source is given:

pip install git+https://gitlab.com/bottlepy/bottle
pip install https://gitlab.com/bottlepy/bottle/archive/refs/heads/master.zip
pip install path/to/my/python/package/folder/
pip install path/to/my/python/package/zip/file.zip
pip install numpy    # Will seek on PyPI
pip install numpy==1.21.5   # Force a specific version
pip uninstall numpy

Non-installable Python projects usually have a file requirements.txt at their root

# requirements.txt
redis==3.2.0
Flask
celery>=4.2.1
pytest

pip offers the following commands:

  • pip install -r requirements.txt to install all dependencies from the file
  • pip freeze > requirements.txt to create a file of frozen versions

💡 installable packages have no such file but specify dependencies elsewhere (e.g. in pyproject.toml for installable packages using setuptools).

PyPI Security warning 🚨

PyPI packages caught stealing credit card numbers & Discord tokens

Perform sanity checks before installing a package

  • Is the package still maintained and documented?
    • e.g. last update: November, 2017
  • Does the developer consider bugs and improvements?
    • e.g. number of solved GitLab issues
  • Is the package developer reliable?
    • Legal entity or individual, which company, experience...
  • If not opensource, is the development of this package likely to continue?
    • Number of opensource users, number of clients, company financial health if not opensource, ...

PyPI Typosquatting warning 🚨

pip install -r requirements.txt
# 🚨 pip install requirements.txt

pip install rabbitmq
# 🚨 pip install rabitmq

pip install matplotlib
# 🚨 pip install matploltib

Virtual environments (venv)

Context: All installed packages go into the site-packages directory of the interpreter.

The venv module provides support for creating lightweight “virtual environments” with their own site directories, optionally isolated from system site directories.

Each virtual environment has its own Python binary (which matches the version of the binary that was used to create this environment) and can have its own independent set of installed Python packages in its site directories.

For each new project you create or clone, create its own dedicated virtual environment:

/usr/bin/python3.9 -m venv dev/PythonTraining/venv

Then, every time you work on this project, activate its environment first:

source dev/PythonTraining/venv/bin/activate

Your terminal should now prefix the prompt with the name of the env:

(venv) yoan@humancoders ~/dev/PythonTraining $

And quit the venv every time you stop working on the project:

(venv) yoan@humancoders ~/dev/PythonTraining $ deactivate
yoan@humancoders ~/dev/PythonTraining $ 

In an activated venv, every call to the interpreter and every package installation will target the isolated virtual environment:

(venv) yoan@humancoders ~/dev/PythonTraining $ python

will run the Python version targeted by the venv

(venv) yoan@humancoders ~/dev/PythonTraining $ pip install numpy

will install the latest numpy version into the venv

In practice, your IDE can handle venv creation, activation and deactivation automatically for you when you create or open/close a project.

🆕 PEP 668

On systems whose Python installation is marked as externally managed (e.g. recent Debian or Ubuntu releases), pip refuses to install packages outside a venv.

You can override this behaviour by passing --break-system-packages.

Testing

  • pytest and unittest are frequently used to test Python apps (unittest ships with the standard library, pytest is installed from PyPI)
  • unittest follows the classic structure of a test:
    • Setup: prepare every prerequisite for the test
    • Call: call the tested function with the input parameters set up before
    • Assertion: an assert states a condition that must hold true
    • Tear down: clean up everything that has been created for this test
  • pytest is a lightweight test framework
  • On top of these, tox allows running tests in multiple environments (e.g. Python versions)
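
The four phases listed above for unittest can be sketched as follows. The test case TestShoppingList and its data are hypothetical. The last lines run the test programmatically so that the snippet is self-contained.

```python
import unittest


class TestShoppingList(unittest.TestCase):
    def setUp(self):
        # Setup: runs before each test method
        self.items = ["apple", "pear"]

    def test_append(self):
        # Call: execute the code under test
        self.items.append("plum")
        # Assertion: check the expected outcome
        self.assertEqual(self.items, ["apple", "pear", "plum"])

    def tearDown(self):
        # Tear down: runs after each test method
        self.items.clear()


# Run the test case programmatically and collect the result
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestShoppingList)
result = unittest.TextTestRunner().run(suite)
print(result.wasSuccessful())   # True
```

In a real project, tests are more commonly launched from a shell with python -m unittest.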

Test files are often placed in a tests/ directory, file names are prefixed with test_ (e.g. test_app.py) and test function names are also prefixed with test_

pyproject.toml
mypkg/
    __init__.py
    app.py
    view.py
tests/
    test_app.py
    test_view.py
    ...

Naming tests according to these conventions will allow auto-discovery of tests by the test tool: it will go through all directories and subdirectories looking for tests to execute.

# water_tank.py
class WaterTank:
    def __init__(self):
        self.level = 10

    def pour_into(self, recipient_tank: "WaterTank", quantity: int):
        self.level -= quantity
        recipient_tank.level += quantity

# test_water_tank.py
from water_tank import WaterTank

def test_water_transfer():
    a = WaterTank()
    b = WaterTank()
    a.pour_into(b, 6)
    assert a.level == 4 and b.level == 16

Then just type pytest and the test report will be printed in the terminal!

============ 1 passed in 0.01s ============

Object-Oriented Programming (O.O.P.)

Here is a program to handle the sales of an apartment:

apartment_available = True
apartment_price = 90000

def sell():
   apartment_available = False

def reduce_price(percentage=5):
   apartment_price = apartment_price * (1-percentage/100)

Note: because of the scope of variables, global variables would be required here
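
For reference, a runnable variant of the program above could look like the sketch below, which uses the global keyword mentioned in the note. The calls at the end are added for illustration only.

```python
apartment_available = True
apartment_price = 90000


def sell():
    global apartment_available      # Without this line, a local variable would be created
    apartment_available = False


def reduce_price(percentage=5):
    global apartment_price          # Without this line, an UnboundLocalError is raised
    apartment_price = apartment_price * (1 - percentage / 100)


reduce_price(10)
sell()
print(apartment_available, round(apartment_price))   # False 81000
```

This works for a single apartment only: handling a second one would require duplicating every variable and function.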

In classic programming, these are variables...

apartment_available = True
apartment_price = 90000

... and these are functions:

def sell():
   apartment_available = False

def reduce_price(percentage=5):
   apartment_price = apartment_price * (1-percentage/100)

However, functions usually operate on data stored in variables. So functions are linked to variables.

In Object-Oriented Programming, variables and functions are grouped into a single entity named a class that behaves as a data type:

class Apartment:
    def initialize_variables():
        apartment_available = True
        apartment_price = 90000

    def sell():
        apartment_available = False

    def reduce_price(percentage=5):
        apartment_price = apartment_price * (1-percentage/100)

Note: this intermediary explanation is not yet a valid Python code snippet

Object-Oriented Programming introduced specific vocabulary:

Types are called classes:

class Apartment:   

Functions are called methods:

    def sell():    

Variables are called attributes:

        apartment_available = False

Since the declaration of a class defines a new type (here, Apartment), the program can declare several independent apartments:

apartment_dupont = Apartment()
apartment_muller = Apartment()

apartment_dupont.reduce_price(15)
apartment_muller.reduce_price(7)
apartment_dupont.sell()
apartment_muller.reduce_price(3)
apartment_muller.sell()

apartment_dupont = Apartment()

In this statement:

  • Apartment is a class
  • apartment_dupont is an object (an instance of a class)
  • Apartment() is the constructor (the method creating an object out of a class)

apartment_dupont.reduce_price(15)

This statement is a method call on object apartment_dupont.

Method calls can create side effects to the object (modifications of its attributes).

Like regular functions, methods can take parameters in input. Here, an integer, 15.

The self object

  • self is the name designating the instantiated object
  • self is implicitly passed as the first argument for each method call
  • self can be read as "this object"

In other languages like Java or C++, self is named this.

The constructor

The constructor is the specific method that instantiates an object out of a class. It is always named __init__.

class Test:
    def __init__(self):
        self.attribute = 42

Here is now a valid Python syntax for our class.

This is the class declaration:

class Apartment:
    def __init__(self):       # Implicit first parameter is self
        self.available = True       # We are creating an attribute in self
        self.price = 90000

    def sell(self):
        self.available = False

    def reduce_price(self, percentage=5):
        self.price = self.price * (1-percentage/100)

This is the class instantiation:

apart_haddock = Apartment()

The constructor, like any other method, can accept input parameters:

class Apartment:
    def __init__(self, price):
        self.available = True
        self.price = price

apart_dupont = Apartment(120000)    # Now the price is compulsory
apart_haddock = Apartment(90000)

While attributes are accessed using the prefix self. from inside the class...

...they can be accessed from outside the class, using object name as the prefix:

print(f"This flat costs {apart_haddock.price}")
apart_haddock.available = False

However some attributes may have a protected or private scope:

class Foo:
    def __init__(self):
        self.public = 0
        self._protected = 0
        self.__private = 0        # ⚠ Name mangling applies here

Protected attributes are not enforced but private ones rely on name mangling:

class BankAccount:
    def __init__(self):
        self.__balance = 3000
         
class Client:
    def make_transaction(self, bank_account: "BankAccount"):
        bank_account.__balance += 1000
         
Client().make_transaction(BankAccount())
# AttributeError: 'BankAccount' object has no attribute '_Client__balance'
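
A short sketch of what name mangling does behind the scenes. The methods deposit and get_balance are hypothetical additions to BankAccount, introduced for this illustration.

```python
class BankAccount:
    def __init__(self):
        self.__balance = 3000           # Actually stored as _BankAccount__balance

    def deposit(self, amount: int):     # Hypothetical public method
        self.__balance += amount        # Mangled the same way inside the class

    def get_balance(self) -> int:
        return self.__balance


account = BankAccount()
account.deposit(1000)
print(account.get_balance())            # 4000
print(account._BankAccount__balance)    # 4000: mangling is not real privacy
print(hasattr(account, "__balance"))    # False
```

Name mangling mainly prevents accidental clashes between classes. It does not prevent deliberate access from outside.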

Inheritance

A furnished apartment is the same as an Apartment. But with additional furniture.

class FurnishedApartment(Apartment):   # The same as an Apartment...
    def __init__(self, price):
        self.furnitures = ["bed", "sofa"]  # ...but with furniture
        super().__init__(price)


furnished_apart = FurnishedApartment(90000)
furnished_apart.available = False
furnished_apart.reduce_price(5)
furnished_apart.furnitures.append("table")

The super() function gives access to the parent class, e.g. to call the same method as defined there.

Note: Python 2 requires a longer syntax: super(CurrentClassName, self)

Terms from object-oriented programming (O.O.P.) to remember:

  • A class is a type owning attributes and methods
  • An object is an instance of a class
  • Instantiating a class consists in building an object from this class
  • The constructor is the method initializing the object: __init__()
  • An attribute is a variable from a class (or from an object)
  • A method is a function from a class (or from an object)
  • A (child) class may inherit from another (parent)
  • A child class method may override the method of the same name in the parent class
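These terms can be mapped onto a few lines of code. This is a sketch under assumptions: the Apartment class is reduced to a minimum here and the describe() method is hypothetical, introduced only to show overriding.

```python
class Apartment:                        # a class (the parent)
    def __init__(self, price):          # the constructor
        self.price = price              # an attribute
        self.available = True

    def describe(self):                 # a method (hypothetical, for illustration)
        return f"Apartment at {self.price}"


class FurnishedApartment(Apartment):    # a child class inheriting from Apartment
    def __init__(self, price):
        super().__init__(price)
        self.furnitures = ["bed", "sofa"]

    def describe(self):                 # overrides Apartment.describe()
        return super().describe() + f" with {len(self.furnitures)} pieces of furniture"


flat = FurnishedApartment(90000)        # instantiation: flat is an object
print(flat.describe())                  # Apartment at 90000 with 2 pieces of furniture
print(isinstance(flat, Apartment))      # True: a child instance is also a parent instance
```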

Magic methods (aka dunder methods = double-underscore methods)

  • apart1 + apart2 → Apartment.__add__(self, other) → Addition
  • apart1 * apart2 → Apartment.__mul__(self, other) → Multiplication
  • apart1 == apart2 → Apartment.__eq__(self, other) → Equality test
  • str(apart) → Apartment.__str__(self) → Readable string
  • repr(apart) → Apartment.__repr__(self) → Unambiguous string (for debugging)
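A possible implementation of some of these methods is sketched below. What "+" and "==" mean for apartments is an assumption made for the example (total price, same price): the course does not define it.

```python
class Apartment:
    def __init__(self, price):
        self.price = price

    def __add__(self, other):
        # Assumed meaning of "+": the total price of both apartments
        return self.price + other.price

    def __eq__(self, other):
        # Assumed meaning of "==": both are apartments with the same price
        return isinstance(other, Apartment) and self.price == other.price

    def __str__(self):
        return f"Apartment costing {self.price}"

    def __repr__(self):
        return f"Apartment({self.price})"


apart1, apart2 = Apartment(120000), Apartment(90000)
print(apart1 + apart2)                 # 210000
print(apart1 == Apartment(120000))     # True
print(str(apart1))                     # Apartment costing 120000
print(repr(apart1))                    # Apartment(120000)
```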

Magic methods reading or altering attributes:

  • getattr(apart, "price") → Apartment.__getattr__(self, name)
  • setattr(apart, "price", 10) → Apartment.__setattr__(self, name, val)
  • delattr(apart, "price") → Apartment.__delattr__(self, name)

This is why Python's duck typing does not rely on nominative types.

In [1]: dir(int)

Out[1]: 
['__abs__', '__add__', '__and__', '__bool__', '__ceil__', '__class__',
 '__delattr__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__',
 '__floor__', '__floordiv__', '__format__', '__ge__',
 '__getattribute__', '__getnewargs__', '__gt__', '__hash__', '__index__',
 '__init__', '__init_subclass__', '__int__', '__invert__', '__le__',
 '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__', '__neg__',
 '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__',
 '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__',
 '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__',
 '__round__', '__rpow__', '__rrshift__', '__rshift__',
 '__rsub__', '__rtruediv__', '__rxor__', '__setattr__',
 '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__',
 '__trunc__', '__xor__', 'as_integer_ratio', 'bit_length',
 'conjugate', 'denominator', 'from_bytes', 'imag', 'numerator',
 'real', 'to_bytes']
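Duck typing in practice: a function only needs its argument to provide the right magic method, whatever its type. In this sketch the Building class and the describe_size() function are hypothetical names chosen for illustration.

```python
class Building:
    def __init__(self, apartments):
        self.apartments = apartments

    def __len__(self):
        # Called by the built-in len()
        return len(self.apartments)


def describe_size(container):
    # No type check: any object implementing __len__ is accepted
    return f"{len(container)} item(s)"


print(describe_size([1, 2, 3]))                 # 3 item(s)
print(describe_size("abc"))                     # 3 item(s)
print(describe_size(Building(["A1", "A2"])))    # 2 item(s)
```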

CHAPTER 3

GO FURTHER: THE PYTHON ECOSYSTEM

Popular built-in libraries (coming with all Python distribs)

  • math, datetime, random, re: math, time, random generation & regex tools
  • sys: communicate with the interpreter (args, stdin, script exit, path...)
  • os: communicate with the OS (file access, low-level fd, OS-specific...)
  • logging: handle log files and streams with different levels and filters
  • pathlib: handle file paths, discriminate files & folders, check file existence...
  • json, csv: (de)serialize data in JSON/CSV format
  • requests: emit synchronous HTTP requests (third-party: not built in, install it with pip)
  • subprocess: open a child process from Python
  • argparse: access, typing and management of script parameters
  • asyncio: Asynchronous I/Os with coroutine-based concurrent code (promises)
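As an illustration, two of these libraries combined: json to (de)serialize and pathlib to handle the file. The contacts data and the file name are made up for the example, and tempfile (also built in) is only used here to avoid leaving files behind.

```python
import json
import tempfile
from pathlib import Path

contacts = {"Haddock": "haddock@example.org"}    # hypothetical data

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "contacts.json"
    path.write_text(json.dumps(contacts), encoding="utf-8")    # serialize to a file
    print(path.exists(), path.is_file())                       # True True
    loaded = json.loads(path.read_text(encoding="utf-8"))      # deserialize

print(loaded == contacts)    # True
```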

Popular non-builtin libraries

To be installed with pip if needed: pip install <libname>

Other non-builtin libraries for code quality

To be installed with pip if needed: pip install <libname>

  • sphinx: build beautiful documentation out of code documentation
  • pytest, unittest: test frameworks (unittest is built in)
  • tox: test automation (e.g. Continuous Integration)
  • mypy: static type checker for annotated code (see PEP 484 Type Hints)
  • pytype: type checker for unannotated code
  • pylint, pyflakes: Syntactic and semantic checkers (linters)
  • pycodestyle (formerly named pep8): Style checker
  • poetry: Deterministic package-and-dependency manager
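A small sketch of the kind of code these tools work on: type annotations for a static checker such as mypy, and a test_* function that pytest would collect. The apply_discount() function is a hypothetical example, not part of the course projects.

```python
def apply_discount(price: float, percentage: float) -> float:
    """Return the price reduced by the given percentage."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return price * (1 - percentage / 100)


def test_apply_discount():
    # pytest collects functions whose name starts with test_
    assert apply_discount(200, 50) == 100
    assert apply_discount(80, 0) == 80


test_apply_discount()    # called by hand here; pytest would run it for us
```

With these annotations, a call such as apply_discount("200", 50) is the kind of mistake a static type checker is expected to report before the code runs.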

The Python Enhancement Proposals (PEP) and the PEP 8

Python versions

Brief focus on Python 2 (if relevant)

Python 2 has not been supported since January 2020: forget it!

Main differences Python 2 → Python 3:

  • print "Hello" → print("Hello")
  • raise ValueError, "problem!" → raise ValueError("problem!")
  • type(1/3) == int → type(1/3) == float
  • str (encoded)/unicode (decoded) → str (decoded)/bytes (encoded)
  • Source files are ASCII by default, unless declared with # encoding: utf8 → UTF-8 by default
  • zip([0,1], [8,9]) returns a list → zip([0,1], [8,9]) returns a zip object (lazy iterator)
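The Python 3 side of these differences can be checked in an interpreter. A minimal sketch (the variable names are arbitrary):

```python
print(type(1 / 3).__name__)     # float: / is always a true division
print(type(1 // 3).__name__)    # int: // is the floor division

pairs = zip([0, 1], [8, 9])
print(type(pairs).__name__)     # zip: nothing is computed yet
first_pass = list(pairs)
second_pass = list(pairs)
print(first_pass)               # [(0, 8), (1, 9)]
print(second_pass)              # []: the iterator is exhausted after one pass

text = "abc"
print(type(text).__name__, type(text.encode("utf-8")).__name__)    # str bytes
```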

Where to find Python documentation?

All built-in libraries are documented in 8 languages on:
📖 docs.python.org
This documentation is community-written.

Non-builtin libraries (e.g. those that you install via pip) usually have their documentation on their own web server.

Also, readthedocs.io is a common place for some of them.

Annual French-speaking conference

Visit https://pycon.fr

Watch the replay of the 2023 talks

Charset and encoding

All text assets (Python string or .py file, .json file, .txt file…) are encoded using a charset, a correspondence table between bytes ↔ actual characters

  • e.g. in utf-8: 0xC3A9 ↔ é
  • e.g. in latin-1: 0xE9 ↔ é

You MUST know the encoding of a text asset in order to read it.

If you do not, it can be guessed, but the guess may be wrong.

This is what happens if you do not specify explicit encodings and rely on default parameters of file reading libraries.

If the guess is wrong it may result in e.g. hÃ©tÃ©rogÃ¨ne instead of hétérogène (UTF-8 bytes decoded as latin-1).
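A wrong guess can be reproduced on purpose. This sketch encodes a string in UTF-8 and then decodes the same bytes with the wrong and with the right charset:

```python
original = "hétérogène"
raw = original.encode("utf-8")       # the bytes as stored on disk or sent on the network

print("é".encode("utf-8"))           # b'\xc3\xa9'
print("é".encode("latin-1"))         # b'\xe9'

wrong = raw.decode("latin-1")        # wrong guess: each byte becomes one character
right = raw.decode("utf-8")          # right charset
print(wrong)                         # hÃ©tÃ©rogÃ¨ne
print(right == original)             # True

try:
    "é".encode("latin-1").decode("utf-8")    # the opposite mistake
except UnicodeDecodeError as error:
    print("Cannot decode:", error.reason)
```

A wrong guess does not always fail silently: latin-1 accepts any byte, whereas UTF-8 rejects invalid sequences with a UnicodeDecodeError.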

Rule of thumb for encoding and decoding

  • I RECEIVE data coming IN the interpreter (from stdin, the network, a file...):

    • If it is a str: it has already been decoded by reading functions
      (Pray that they used the right charset 🙏)
    • If it is a bytes: decode it with the charset declared by the source e.g.
      data.decode("utf-8") if the source sends UTF-8 strings
  • My Python code must operate only on Unicode (Python type str)

  • I SEND data OUT of the interpreter (to stdout, to the network, to a file...):

    • If it is a bytes: it has already been encoded by writing functions
      (Pray that they used the right charset 🙏)
    • If it is a str: encode it with the charset declared by the recipient e.g.
      data.encode("utf-8") if the recipient expects UTF-8 strings
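The rule of thumb can be sketched as a "decode, process, encode" pipeline. The shout() function and the choice of charsets are hypothetical, picked only to illustrate the three steps.

```python
def shout(received: bytes, source_charset: str, recipient_charset: str) -> bytes:
    text = received.decode(source_charset)      # IN: bytes -> str, as early as possible
    text = text.upper()                         # the processing only sees Unicode (str)
    return text.encode(recipient_charset)       # OUT: str -> bytes, as late as possible


incoming = b"caf\xe9"                           # "café" sent by a latin-1 source
outgoing = shout(incoming, "latin-1", "utf-8")  # the recipient expects UTF-8
print(outgoing)                                 # b'CAF\xc3\x89'
```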

What is Unicode?

Unicode is NOT a charset: it is the global table of all the world's code points.
e.g. U+1F601: 😁

Encodings (ASCII, latin-1, UTF-8...) may be able to code Unicode in whole or in part.

ASCII and latin-1 can only code a subset of Unicode (respectively 128 and 256 code points).

UTF-8, UTF-16 and UTF-32 can code all Unicode code points.

The difference between UTF-8, UTF-16 and UTF-32 lies in how many bytes they use to code each character:

  • UTF-8 uses a variable number of bytes (1 to 4)
  • UTF-16 uses a variable number of bytes (2 or 4)
  • UTF-32 uses a fixed number of 4 bytes

On average, UTF-16 is more compact than UTF-8 for Asian texts (many CJK characters take 2 bytes in UTF-16 but 3 bytes in UTF-8). But UTF-8 is more widely recommended as a global standard.
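These sizes can be measured by encoding a few characters. In this sketch the sample characters are arbitrary, and the "-le" codec variants are used so that no byte order mark is added to the count.

```python
encodings = ("utf-8", "utf-16-le", "utf-32-le")
sizes_by_char = {}

for char in ["a", "é", "漢", "😁"]:
    # Number of bytes needed by each encoding for this single character
    sizes_by_char[char] = {enc: len(char.encode(enc)) for enc in encodings}
    print(f"U+{ord(char):04X}", sizes_by_char[char])

# U+0061 {'utf-8': 1, 'utf-16-le': 2, 'utf-32-le': 4}
# U+00E9 {'utf-8': 2, 'utf-16-le': 2, 'utf-32-le': 4}
# U+6F22 {'utf-8': 3, 'utf-16-le': 2, 'utf-32-le': 4}
# U+1F601 {'utf-8': 4, 'utf-16-le': 4, 'utf-32-le': 4}
```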

  • The Python interpreter holds:
    • decoded unicode strings in type str
    • encoded strings in type bytes
  • encode() and decode() methods on these objects can convert them between bytes and str
  • stdout and stderr in your terminal are outputs, they have a charset
  • stdin in your terminal is an input, it has a charset
  • Your .py file itself is an input, it has a charset
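In practice, this means passing an explicit encoding whenever a file is opened, rather than relying on the platform default. A minimal sketch, using a temporary file with a made-up name:

```python
import sys
import tempfile
from pathlib import Path

print(sys.stdout.encoding)    # charset of the terminal output: depends on your system

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "note.txt"
    with open(path, "w", encoding="utf-8") as file:    # str -> bytes when writing
        file.write("hétérogène")
    with open(path, "r", encoding="utf-8") as file:    # bytes -> str when reading
        text = file.read()
    raw = path.read_bytes()                            # the encoded content, untouched

print(text == "hétérogène")    # True
print(len(text), len(raw))     # 10 13: 10 characters stored on 13 bytes
```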