Skip to the content. Back to Chapter 1

Sorting Algorithms

Theory before hands-on session

O(n2) algorithms

- Simple but complex -

Bubble sort algorithm

def bubble_sort(to_sort: list) -> None:
    """Sort the list in-place"""
    num_items = len(to_sort)
    for i in range(num_items - 1):
        swap = False
        for j in range(num_items - i - 1):  # optimisation
            if to_sort[j] > to_sort[j + 1]:
                to_sort[j], to_sort[j + 1] = to_sort[j + 1], to_sort[j]
                swap = True
        if not swap:
            break

Insertion Sort

def insertion_sort(to_sort: list) -> None:
    """Sort the list in-place"""
    n = len(to_sort)
    for i in range(1, n):
        cur = to_sort[i]
        pos = i
        while pos > 0 and to_sort[pos - 1] > cur:
            to_sort[pos] = to_sort[pos - 1]
            pos -= 1
        to_sort[pos] = cur
  1. Describe how an insertion sort works.
    • An insertion sort works by separating the list into sorted and unsorted segments, and moving items from the unsorted segment into the sorted segment in their correct place.
    • The sorted segment initially contains one item; the first item in the list.
    • Each element in turn in the unsorted segment (all other items) get swapped into place in the sorted segment of the list such that after each pass the sorted segment is always sorted.
  2. What is the first name to be moved? What will the list look like after this name is moved?
    • The first name to be moved is Ahmed, in the third pass; the list will contain [Ahmed, George, Jane, Miranda], [Sophie, Bernie, Keith] after this pass
  3. What is the second name to be moved? What will the list look like after this name is moved?
    • The second name to be moved is Bernie, in the fifth pass; the list will contain [Ahmed, Bernie, George, Jane, Miranda, Sophie], [Keith] after this pass
  4. What is the sequence of the list after the first pass is completed?
    • [George, Jane], [Miranda, Ahmed, Sophie, Bernie, Keith]
  5. What is the best/average/worst time complexity of the insertion sort?
    • Best case: O(n)
    • Average case: O(n2)
    • Worst case: O(n2)

Very inefficient algorithms

Ineffective Sorts

StackSort connects to StackOverflow, searches for 'sort a list', and downloads and runs code snippets until the list is sorted

Algorithms

There was a schism in 2007, when a sect advocating OpenOffice created a fork of Sunday.xlsx and maintained it independently for several months. The efforts to reconcile the conflicting schedules led to the reinvention, within the cells of the spreadsheet, of modern version control.

Algorithm Best-case Time Complexity Average-case Time Complexity Worst-case Time Complexity Space Complexity
Bubble Sort O(n) O(n2) O(n2) O(n) total, O(1) auxiliary
Insertion Sort O(n) O(n2) O(n2) O(n) total, O(1) auxiliary
Merge Sort O(nlogn) O(nlogn) O(nlogn) O(n) total, O(n) auxiliary (my in-place algorithm) / O(nlogn) total, O(nlogn) auxiliary (naive merge sort)
Quick Sort O(nlogn) O(nlogn) O(n2) O(n) total, O(1) auxiliary

O(nlogn) algorithms

- Less simple but less complex -

Merge Sort

def inplace_merge_sort(to_sort: list, start: int = 0, end: int = -1) -> None:
    """Sort the list in-place"""
    if end == -1:
        end = len(to_sort)
    if start == end - 1:
        # 1 element; sublist is sorted
        return
    mid = (end - start) // 2 + start
    # sort the two sublists before combining
    inplace_merge_sort(to_sort, start, mid)
    inplace_merge_sort(to_sort, mid, end)
    # combine
    pos = start
    # I can't remember the really nice way of doing in-place merge sort;
    # so I'm doing the best I can
    # space complexity increases to O(n) / O(n) total / auxiliary
    # using a reversed list here allows the merge sort to remain O(nlogn)
    # as the pop operation is O(1) compared to pop(0) [O(n)]
    left_copy = to_sort[start:mid]
    left_copy.reverse()
    left_optimise = True
    while left_copy and (mid < end):
        if left_copy[-1] < to_sort[mid]:
            elem = left_copy.pop()
            # if no elements from the right section have been used, don't copy
            # the element over itself
            if not left_optimise:
                to_sort[start] = elem
            start += 1
        else:
            left_optimise = False  # can't optimise the left list anymore
            to_sort[start] = to_sort[mid]
            start += 1
            mid += 1
    # either left_copy is empty and the right side is placed correctly
    # (and the body of this loop will never run), or all the elements from
    # the right side have been used and now the left side must be placed there
    left_copy.reverse()
    for elem in left_copy:
        to_sort[start] = elem
        start += 1


def merge_sort(to_sort: list) -> list:
    """Sort and return the list (does not modify the original list)"""
    items = len(to_sort)
    if items == 1:
        # don't return the same list in case it's the user's list
        # technically O(n) but n = 1 so it's O(1)
        return to_sort.copy()
    mid = items // 2
    # reversal = one-time cost of O(n) + n-time cost of O(1)
    left = merge_sort(to_sort[:mid])
    left.reverse()
    right = merge_sort(to_sort[mid:])
    right.reverse()
    out = []
    while left and right:
        if left[-1] < right[-1]:
            out.append(left.pop())
        else:
            out.append(right.pop())
    # either left or right is empty at this point
    # the other contains the last elements of the list
    # empty both lists; one loop won't run at all
    while left:
        out.append(left.pop())
    while right:
        out.append(right.pop())
    return out

Merge Sort Visualisation

  1. Explain the key steps in the merge sort algorithm

    • If the list contains one element, return it
    • Split the list into two halves
    • Merge sort them (recursively)
    • Merge the two lists by comparing the smallest element of each list and adding it to the output list
  2. The following list of numbers is to be sorted using a merge sort:

    [54, 36, 66, 78, 64, 19, 42, 44, 51, 89, 72, 62, 22, 67, 81, 79]

    Which answer below shows the first two lists to be merged?

    1. [44] and [51]
    2. [54] and [36]
    3. [54, 36] and [66, 78]
    4. [19, 36, 42, 44, 54, 64, 66, 78] and [22, 51,62, 67, 72, 79, 81, 89]
  3. What is the best/average/worst time complexity of merge sort?

    • All cases: O(nlogn)
  4. What about the space complexity?

    • Best implementation: O(n) total, O(1) auxiliary
    • My implementation: O(n) total, O(n) auxiliary
    • Naive implementation: O(nlogn) total, O(nlogn) auxiliary

Quick Sort

  1. Explain the key steps in the quick sort algorithm
    • Pick a pivot
    • Move all values less than the pivot to before the pivot
    • Move all values greater than the pivot to after the pivot
    • Sort each partition recursively
  2. What is the best/average/worst time complexity of quick sort?
    • Best case: O(nlogn)
    • Average case: O(nlogn)
    • Worst case: O(n2)
  3. What about the space complexity?
    • O(n) total, O(1) auxiliary