Fuel Your Dev

Saturday, February 29, 2020

Answer to Effective Python Exercise: Wizardly Wiles

In the last post we set up a problem to solve with a few bugs and a couple of things to optimize. Our sales golem is not working at optimal efficiency, and so we've set out to fix that. But before we get to that, let's explore some of the problem cases. First up, if we run this code:

customer = HumanCustomer(["4 grams of magic powder"])
print(golem.help_customer(customer))

We will see this output:

('0.00', [4 grams of magic powder])

Which shows us that the order for 4 grams of magic powder was fulfilled, but the price charged to the customer was 0.00. That's not acceptable. We can't be giving things away for free.

This problem is fairly straightforward to fix. in the _process_payment method, we'll want to switch from using float to Decimal, and we'll set the Decimal to round up. Also note that for the best accuracy, it's typically best to instantiate a Decimal using a string. The code below should fix this problem.

    def _process_payment(self, order: List[Union[MagicPowderContainer, Book, None]]) -> str:
        price = Decimal("0")
        for item in order:
            if isinstance(item, MagicPowderContainer):
                price += Decimal(item.measure()) * \
                         Decimal(magic_powder_price_per_gram_sign)
            elif isinstance(item, Book):
                price += Decimal(item.price)
        price = price.quantize(Decimal("0.01"), rounding=ROUND_UP)
        return f"{price:.2f}"

Once this code change is in place, then running the example from above will output the following:

('0.01', [4 grams of magic powder])

The next problem is a little tricky, but take a look at what you get when you run this code:

print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["100000 grams of magic powder"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
customer = OrcCustomer(["Magical Maladies"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
magic_powder_container.fill(100000)
print("Filled")
print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["10 grams of magic powder"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
customer = OrcCustomer(["10 grams of magic powder"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["10 grams of magic powder"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")

It outputs the following:

Magic powder: 100000
('100.00', [100000 grams of magic powder])
Magic powder: 0
('0.00', [])
Magic powder: 0
Filled
Magic powder: 100000
('0.01', [10 grams of magic powder])
Magic powder: 99989
('0.01', [10 grams of magic powder])
Magic powder: 99978
('0.01', [10 grams of magic powder])
Magic powder: 99968

So what we're seeing here is the first customer comes in and places an order for all of your magic powder. The order gets fulfilled and the customer is charged 100.00.

After this, the next customer, an orc, comes and tries to place an order. Because the magic powder is gone, the golem is unable to use the translation amulet, and is unable to fulfill the customer's order. The customer is given nothing, and is charged nothing. After this the magic powder container gets refilled (undoubtedly because that last orc customer came and complained to you about the service, at which point you realized that the magic powder had run out, and you refilled it from out of the back room).

After this the golem serves the next customer, a human, who asks for 10 grams of magic powder. After this interaction you check the magic powder container, expecting to find it 10 grams lighter, but instead it is 11 grams lighter. What happened? As it turns out, when the golem determined that it was unable to translate for the previous orc customer, it also failed to turn the translation amulet off. So the translation amulet is left on, and will stay on, until a translation is actually needed (such as with the next customer, which is an orc, and 11 grams are removed as expected) at which point in time the golem will finally turn it off, and things will start working as expected (such as with the customer after that, which is human, and 10 grams are removed as expected).

To fix this, after activating the translation amulet, the code that will end up using the amulet should be in a try block, and the code that deactivates the amulet should be in a finally block, like this:

    def _ask_for_order(self, customer: Customer) -> List[str]:
        try:
            if customer.race != "human":
                translation_amulet.activate()
                try:
                    order = customer.get_order()
                finally:
                    translation_amulet.deactivate()
            else:
                order = customer.get_order()
            return order
        except:
            return []

Now it's worth noting at this point that this is not the only place that uses the translation amulet. It is also used for translating the titles of books. We could also add a try finally block there, but at this point it might be worthwhile to consider creating a function using the contextmanager annotation so that we can use the amulet via a with statement. For instance, if we were to create the following function:

@contextmanager
def use_amulet(amulet: TranslationAmulet):
    amulet.activate()
    try:
        yield
    finally:
        amulet.deactivate()

Then the _ask_for_order method could be changed to this:

    def _ask_for_order(self, customer: Customer) -> List[str]:
        try:
            if customer.race != "human":
                with use_amulet(translation_amulet):
                    order = customer.get_order()
            else:
                order = customer.get_order()
            return order
        except:
            return []

And the _find_book method could be changed to this:

    def _find_book(self, title: str) -> Union[Book, None]:
        for book in books:
            if book.language != "english":
                with use_amulet(translation_amulet):
                    book_title = book.get_title()
            else:
                book_title = book.get_title()
            if title == book_title:
                books.remove(book)
                return book
        return None

At this point if we were to rerun the example from above, we'd get the following output:

Magic powder: 100000
('100.00', [100000 grams of magic powder])
Magic powder: 0
('0.00', [])
Magic powder: 0
Filled
Magic powder: 100000
('0.01', [10 grams of magic powder])
Magic powder: 99990
('0.01', [10 grams of magic powder])
Magic powder: 99979
('0.01', [10 grams of magic powder])
Magic powder: 99969

You can now see that when the third customer is served, the magic powder is reduced by only 10 grams instead of the 11 grams that it was before.

The final issue that we'll take a look at isn't a bug, but rather it's an optimization issue. When the sales golem is looking for a book to fill an order, it always starts with the first book and looks through all the books in order until it finds the book that it is looking for. Considering that the bookshelf is alphabetized, this is slow, and not only is is slow, it will use up extra magic powder translating more books than it needs to. Take for instance this example:

print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["Wand Maintenance"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")

This will print out:

Magic powder: 100000
('2.37', [Wand Maintenance])
Magic powder: 99996

Notice that it had to use 4 grams of magic powder to translate the titles of the orcish books "Adventures of Thorg" and the three copies of "Sorcery 101" before it was able to locate the book that it was looking for.

On the other hand, had it been using a binary search, it would have only needed to use 1 gram of magic powder, since it would have started with the book in the middle (one of the copies of "Sorcery 101"), used a gram of magic powder to translate that title, at which point it would realize the the book needed will be to the right, and it will rule out all books to the left, then move to the middle book of the books to the right, and find the copy of "Wand Maintenance" and be done.

So how can we do this in Python? Unfortunately it's not as straightforward as I'd like it to be for this case. Python has the functions bisect and bisect_left, which will perform this kind of search, but these functions do not currently allow for a key parameter to be passed in in similar fashion to how many other Python functions work. (Note that there have been conversations to add this functionality to the bisect functions, and there is a code being written for it. See here for the conversation history, and here for the code.) If it did, then we could simply:

    # This won't work because bisect_left doesn't have a key parameter
    def _find_book(self, title: str) -> Union[Book, None]:
        book_title = None
        
        def key(book: Book):
            nonlocal book_title
            if book.language != "english":
                with use_amulet(translation_amulet):
                    book_title = book.get_title()
            else:
                book_title = book.get_title()
            return book_title
        
        index = bisect_left(books, title, key=key)
        if book_title == title:
            book = books.pop(index)
            return book
        else:
            return None

But since it doesn't, what are some options that we could do to make this work? The two primary recommendations are to either 1) maintain a separate list in parallel to the list that you need to search, which in this case would mean having the sales golem maintain a list of the translated titles of the books separate from the books themselves. Or 2) have the object implement the __lt__ method, which in this case would mean adding the __lt__ method to the Book class.

So long as we're sticking to the strictest sense of the rules for the problem, then we would be unable to modify the Book class, which would mean going with option 1, which would potentially look something like this:

class SalesGolem:
    def __init__(self):
        self._book_titles = []
        for book in books:
            if book.language != "english":
                with use_amulet(translation_amulet):
                    book_title = book.get_title()
            else:
                book_title = book.get_title()
            self._book_titles.append(book_title)
    
    ...
    
    def _find_book(self, title: str) -> Union[Book, None]:
        index = bisect_left(self._book_titles, title)
        book_title = self._book_titles[index]
        if book_title == title:
            book = books.pop(index)
            self._book_titles.pop(index)
            return book
        else:
            return None
    
    ...

If we run this with the previous example, we will get the following output:

Magic powder: 99995
('2.37', [Wand Maintenance])
Magic powder: 99995

You'll notice that this is even more expensive than the previous version, using 5 grams of magic powder because it also needed to translate the title for the orcish book "What's the Difference Between a Wizard and a Warlock?", and it had to do all this before even helping the customer. Now, for helping a single customer, this is clearly worse, but the benefits to it would add up over the course of helping multiple customers because all title translations happened up front, and does not need to be repeated for each individual customer. For instance, if we run this scenario:

print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["Wand Maintenance"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["The Vanishing Book on Vanishing"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["Sorcery 101"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["Magical Maladies"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")

Then you'll get the following output:

Magic powder: 99995
('2.37', [Wand Maintenance])
Magic powder: 99995
('0.00', [None])
Magic powder: 99995
('15.47', [Sorcery 101])
Magic powder: 99995
('9.99', [Magical Maladies])
Magic powder: 99995

Notice that outside of the up front cost, no further cost is incurred, so this can work well, if you expect a lot of customers to be buying books. But this does still have a problem. What happens if you add another book to the shelf? Well then the sales golem's internal list of titles gets out of sync, and without resyncing the list, the golem could end up giving the customer the wrong book. Take a look at this scenario:

print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["Wand Maintenance"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["The Vanishing Book on Vanishing"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
print(f"Adding orc book, Arcane Arts")
books.insert(1, OrcishBook("Arcane Arts", "13.33"))
print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["Sorcery 101"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["Magical Maladies"])
print(golem.help_customer(customer))

Which outputs:

Magic powder: 99995
('2.37', [Wand Maintenance])
Magic powder: 99995
('0.00', [None])
Magic powder: 99995
Adding orc book, Arcane Arts
Magic powder: 99995
('39.99', [See the Sites of Middle Earth])
Magic powder: 99995
('13.33', [Arcane Arts])
Magic powder: 99995

The last two customers both get the wrong books. This could be fixed by adding a way to resync the list, which could be something like this:

class SalesGolem:
    def __init__(self):
        self._book_titles = []
        self.sync_book_titles()

    def sync_book_titles(self):
        self._book_titles = []
        for book in books:
            if book.language != "english":
                with use_amulet(translation_amulet):
                    book_title = book.get_title()
            else:
                book_title = book.get_title()
            self._book_titles.append(book_title)

    ...

Then the example needs to be changed to allow for the resync:

print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["Wand Maintenance"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["The Vanishing Book on Vanishing"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
print(f"Adding orc book, Arcane Arts")
books.insert(1, OrcishBook("Arcane Arts", "13.33"))
golem.sync_book_titles()
print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["Sorcery 101"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["Magical Maladies"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")

This will change the output to:

Magic powder: 99995
('2.37', [Wand Maintenance])
Magic powder: 99995
('0.00', [None])
Magic powder: 99995
Adding orc book, Arcane Arts
Magic powder: 99989
('15.47', [Sorcery 101])
Magic powder: 99989
('9.99', [Magical Maladies])
Magic powder: 99989

This gets us back to getting the customers the right books, but cost of resyncing is high, and if it's something that we have to do regularly, then this solution is less than ideal.

Another way to approach this is instead of putting the new book on the shelf and telling the golem to resync its internal list, we could instead add some functionality to the golem so that we can give the book to the golem, it can determine where it needs to be in it's internal list, and then it can add it to the bookshelf itself. We could this by adding the following method to the SalesGolem class:

    def add_book(self, book: Book):
        if book.language != "english":
            with use_amulet(translation_amulet):
                book_title = book.get_title()
        else:
            book_title = book.get_title()
        index = bisect_left(self._book_titles, book_title)
        self._book_titles.insert(index, book_title)
        books.insert(index, book)

And then changing the scenario to use this method:

print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["Wand Maintenance"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["The Vanishing Book on Vanishing"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
print(f"Adding orc book, Arcane Arts")
golem.add_book(OrcishBook("Arcane Arts", "13.33"))
print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["Sorcery 101"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")
customer = HumanCustomer(["Magical Maladies"])
print(golem.help_customer(customer))
print(f"Magic powder: {magic_powder_container.measure()}")

Which then has this output:

Magic powder: 99995
('2.37', [Wand Maintenance])
Magic powder: 99995
('0.00', [None])
Magic powder: 99995
Adding orc book, Arcane Arts
Magic powder: 99994
('15.47', [Sorcery 101])
Magic powder: 99994
('9.99', [Magical Maladies])
Magic powder: 99994

This is much better, it only needs to use 1 more gram of magic powder when the new Orcish book is added, and if an English book is added, it won't need to use any at all.

So is that it? Or could we do better? While the cached titles are nice, it would be better if we could get them lazily instead of eagerly. Is there a way that we could do this? Perhaps instead of having our internal title list being a list of strings that we have to pre populate before we can use it, we could instead fill it with proxy objects that will only get the titles as needed. So if we create this class:

class BookProxy:
    def __init__(self, book: Book):
        self._book = book
        self._title = None

    def get_title(self):
        if not self._title:
            if self._book.language != "english":
                with use_amulet(translation_amulet):
                    self._title = self._book.get_title()
            else:
                self._title = self._book.get_title()
        return self._title

    def __lt__(self, other):
        if isinstance(other, BookProxy):
            return self.get_title() < other.get_title()
        elif isinstance(other, Book):
            return self.get_title() < other.get_title()
        else:
            return self.get_title() < other

Then we can take the SalesGolem class and modify it as follows:

class SalesGolem:
    def __init__(self):
        self._book_titles = [BookProxy(book) for book in books]
    
    ...
    
    def add_book(self, book: Book):
        book_proxy = BookProxy(book)
        index = bisect_left(self._book_titles, book_proxy)
        self._book_titles.insert(index, book_proxy)
        books.insert(index, book)

    def _find_book(self, title: str) -> Union[Book, None]:
        index = bisect_left(self._book_titles, title)
        book_proxy = self._book_titles[index]
        if book_proxy.get_title() == title:
            book = books.pop(index)
            self._book_titles.pop(index)
            return book
        else:
            return None

    ...

Then we're able to run our scenario and get back the following output:

Magic powder: 100000
('2.37', [Wand Maintenance])
Magic powder: 99999
('0.00', [None])
Magic powder: 99998
Adding orc book, Arcane Arts
Magic powder: 99996
('15.47', [Sorcery 101])
Magic powder: 99995
('9.99', [Magical Maladies])
Magic powder: 99995

At which point everything is running as expected, and the total amount of magic powder used is even less. What's more, we've also managed to optimize for the single customer scenario as well. If we run the single customer scenario from above, then we get this output:

Magic powder: 100000
('2.37', [Wand Maintenance])
Magic powder: 99999

So with that, our final SalesGolem class (and supporting functions and classes) looks like this:

class SalesGolem:
    def __init__(self):
        self._book_titles = [BookProxy(book) for book in books]

    def help_customer(self, customer: Customer) -> (str, List[Union[MagicPowderContainer,
                                                                    Book,
                                                                    None]]):
        order_request = self._ask_for_order(customer)
        order = self._fulfill_order(order_request)
        payment = self._process_payment(order)
        return payment, order

    def _ask_for_order(self, customer: Customer) -> List[str]:
        try:
            if customer.race != "human":
                with use_amulet(translation_amulet):
                    order = customer.get_order()
            else:
                order = customer.get_order()
            return order
        except:
            return []

    def _fulfill_order(self, order_request: List[str]) -> List[Union[MagicPowderContainer,
                                                                     Book,
                                                                     None]]:
        order = []
        for item in order_request:
            if item.endswith(" grams of magic powder"):
                ordered_grams = int(item.split(" ")[0])
                try:
                    grams = magic_powder_container.take_magic_powder(ordered_grams)
                except:
                    grams = magic_powder_container.take_remainder()
                order.append(MagicPowderContainer(grams))
            else:
                try:
                    book = self._find_book(item)
                except:
                    book = None
                order.append(book)
        return order

    def add_book(self, book: Book):
        book_proxy = BookProxy(book)
        index = bisect_left(self._book_titles, book_proxy)
        self._book_titles.insert(index, book_proxy)
        books.insert(index, book)

    def _find_book(self, title: str) -> Union[Book, None]:
        index = bisect_left(self._book_titles, title)
        book_proxy = self._book_titles[index]
        if book_proxy.get_title() == title:
            book = books.pop(index)
            self._book_titles.pop(index)
            return book
        else:
            return None

    def _process_payment(self, order: List[Union[MagicPowderContainer, Book, None]]) -> str:
        price = Decimal("0")
        for item in order:
            if isinstance(item, MagicPowderContainer):
                price += Decimal(item.measure()) * \
                         Decimal(magic_powder_price_per_gram_sign)
            elif isinstance(item, Book):
                price += Decimal(item.price)
        price = price.quantize(Decimal("0.01"), rounding=ROUND_UP)
        return f"{price:.2f}"


@contextmanager
def use_amulet(amulet: TranslationAmulet):
    amulet.activate()
    try:
        yield
    finally:
        amulet.deactivate()


class BookProxy:
    def __init__(self, book: Book):
        self._book = book
        self._title = None

    def get_title(self):
        if not self._title:
            if self._book.language != "english":
                with use_amulet(translation_amulet):
                    self._title = self._book.get_title()
            else:
                self._title = self._book.get_title()
        return self._title

    def __lt__(self, other):
        if isinstance(other, BookProxy):
            return self.get_title() < other.get_title()
        elif isinstance(other, Book):
            return self.get_title() < other.get_title()
        else:
            return self.get_title() < other

Now while this exercise is clearly not grounded in the real world, the issues and solutions do very much apply to real world scenarios. Using a with statement to ensure that translation amulet gets turned off can easily be applied to making sure that an open file gets closed. The issue with the finances and the use of the Decimal class to solve them can clearly be applied to finances in the real world. And minimizing the magic powder usage while speeding up the process of finding the right book could be construed to a real world example of needing to find specific resources, where some of them are local and easy to access, while others would require a network call and would be fairly expensive to make.

Thursday, February 27, 2020

Effective Python Exercise: Wizardly Wiles

Sticking with the D&D Adventure style problems we've seen before, in this exercise for chapter 8 of Effective Python we are a wizard that owns a magic shop called Wizardly Wiles. Being the savvy saleswizard that you are, you've purchased a sales golem to help run the shop and take orders from customers. This sales golem was marketed as being able to serve customers, fill orders, and tally the total price. It is even able to use a translation amulet to help you with your increasing orders from the local Orcish population Sadly, after getting your magic sales golem, you find that there are several inefficiencies in the spells (code) that make it tick. It sometimes won't deactivate the translation amulet, eating up your precious magic powder. It fails to calculate the correct price for small amounts of magic powder. It's slow at searching for the book that the customer wants. So you decide to void the warranty on your sales golem, and make a few modifications to help it run more efficiently in your shop.

What changes could we make to the spells (code)? Note that you're only allow to change the SalesGolem code permanently, but you can experiment with different books on the bookshelf, different customers, different amounts of magic powder, etc, but the optimizations should be applied to the SalesGolem, and not to other code.

from typing import List, Union


class MagicPowderContainer:
    def __init__(self, grams_of_magic_powder: int):
        self.grams_of_magic_powder = grams_of_magic_powder

    def take_magic_powder(self, grams: int) -> int:
        if self.grams_of_magic_powder > grams:
            self.grams_of_magic_powder -= grams
            return grams
        else:
            raise Exception("Not enough magic powder")

    def take_remainder(self) -> int:
        remainder = self.grams_of_magic_powder
        self.grams_of_magic_powder = 0
        return remainder

    def measure(self) -> int:
        return self.grams_of_magic_powder

    def fill(self, grams: int):
        self.grams_of_magic_powder += grams

    def __repr__(self):
        return f"{self.grams_of_magic_powder} grams of magic powder"


class TranslationAmulet:
    # noinspection PyShadowingNames
    def __init__(self, magic_powder_container: MagicPowderContainer):
        self.is_active = False
        self.magic_powder_container = magic_powder_container

    def activate(self):
        self.is_active = True

    def deactivate(self):
        self.is_active = False

    def use(self):
        self.magic_powder_container.take_magic_powder(1)


class Customer:
    def __init__(self, race: str, order: List[str], translated_order: List[str]):
        self.race = race
        self._order = order
        self._translated_order = translated_order

    def get_order(self) -> List[str]:
        if translation_amulet.is_active:
            translation_amulet.use()
            return self._translated_order
        else:
            return self._order


class HumanCustomer(Customer):
    def __init__(self, order: List[str]):
        super().__init__("human", order, order)


class OrcCustomer(Customer):
    def __init__(self, order: List[str]):
        super().__init__("orc", ["*gibberish*" for _ in order], order)


class Book:
    def __init__(self, language: str, title: str, translated_title: str, price: str):
        self.language = language
        self._title = title
        self._translated_title = translated_title
        self.price = price

    def get_title(self) -> str:
        if translation_amulet.is_active:
            translation_amulet.use()
            return self._translated_title
        else:
            return self._title

    def __repr__(self):
        return self._translated_title


class EnglishBook(Book):
    def __init__(self, title: str, price: str):
        super().__init__("english", title, title, price)


class OrcishBook(Book):
    def __init__(self, title: str, price: str):
        super().__init__("orcish", "*gibberish*", title, price)


# noinspection PyMethodMayBeStatic
class SalesGolem:
    def help_customer(self, customer: Customer) -> (str, List[Union[MagicPowderContainer,
                                                                    Book,
                                                                    None]]):
        order_request = self._ask_for_order(customer)
        order = self._fulfill_order(order_request)
        payment = self._process_payment(order)
        return payment, order

    def _ask_for_order(self, customer: Customer) -> List[str]:
        try:
            if customer.race != "human":
                translation_amulet.activate()
                order = customer.get_order()
                translation_amulet.deactivate()
            else:
                order = customer.get_order()
            return order
        except:
            return []

    def _fulfill_order(self, order_request: List[str]) -> List[Union[MagicPowderContainer,
                                                                     Book,
                                                                     None]]:
        order = []
        for item in order_request:
            if item.endswith(" grams of magic powder"):
                ordered_grams = int(item.split(" ")[0])
                try:
                    grams = magic_powder_container.take_magic_powder(ordered_grams)
                except:
                    grams = magic_powder_container.take_remainder()
                order.append(MagicPowderContainer(grams))
            else:
                try:
                    book = self._find_book(item)
                except:
                    book = None
                order.append(book)
        return order

    def _find_book(self, title: str) -> Union[Book, None]:
        for book in books:
            if book.language != "english":
                translation_amulet.activate()
                book_title = book.get_title()
                translation_amulet.deactivate()
            else:
                book_title = book.get_title()
            if title == book_title:
                books.remove(book)
                return book
        return None

    def _process_payment(self, order: List[Union[MagicPowderContainer, Book, None]]) -> str:
        price = 0.
        for item in order:
            if isinstance(item, MagicPowderContainer):
                price += item.measure() * float(magic_powder_price_per_gram_sign)
            elif isinstance(item, Book):
                price += float(item.price)
        return f"{price:.2f}"


magic_powder_container = MagicPowderContainer(100000)
magic_powder_price_per_gram_sign = "0.001"

books = [
    OrcishBook("Adventures of Thorg", "12.57"),
    EnglishBook("Magical Maladies", "9.99"),
    EnglishBook("See the Sites of Middle Earth", "39.99"),
    EnglishBook("See the Sites of Middle Earth", "39.99"),
    OrcishBook("Sorcery 101", "15.47"),
    OrcishBook("Sorcery 101", "15.47"),
    OrcishBook("Sorcery 101", "15.47"),
    EnglishBook("Tales of Treachery", "5.97"),
    EnglishBook("Wand Maintenance", "2.37"),
    OrcishBook("What's the Difference Between a Wizard and a Warlock?", "42.88"),
    EnglishBook("White Wizard", "25.00")
]
# We cheat and access the translated title here without cost so that we can
# make the assumption that the books are sorted, regardless of the order that
# they are specified in the above array
# noinspection PyProtectedMember
books.sort(key=lambda x: x._translated_title)

translation_amulet = TranslationAmulet(magic_powder_container)
golem = SalesGolem()

You can find the answer here.

Wednesday, February 19, 2020

When Presenting, It can be Worthwhile to Add a Fun Twist to the Problem

As a presenter, something that you can do to get more energy and interaction out of your group is to take the topic that you intend to present on, and tie it around a fun concept. A good example of this came from a developer book club that I was facilitating when going over the book Effective Python.

Tyler Brocious was the presenter covering Chapter 5: Classes and Interfaces, and he had the brilliant idea of taking the concepts and tying them in to character classes from D&D, with the exercise being to code up character classes in such a way that we could have classes that were a modification of another class, or a combination of other classes, or a class with a special modification for stealth checks. This led to some entertaining conversations that kept people more engaged with the topic as we talked through classes, inheritance, mixins, and how these things work in Python. It's also worth noting that there was no need for us to stick strictly to how D&D actually works, and we varied away from the standard in order to make or tie in important concepts and points. Here's the code that resulted from the exercise:

import random
import math


class MixinStealthCheck:
    def stealth_check(self):
        roll = random.randint(1, 20)
        if not self.dexterity:
            modifier = 0
        else:
            modifier = math.floor(self.dexterity % 10 / 2)
            if self.dexterity < 10:
                modifier = modifier * -1

        print(f'{self.name} rolled {roll} with modifier {modifier}')
        return roll


class Character:
    def __init__(self, name, strength, dexterity, intelligence, charisma):
        self.name = name
        self.strength = strength
        self.dexterity = dexterity
        self.intelligence = intelligence
        self.charisma = charisma


class Ranger(Character, MixinStealthCheck):
    def __init__(self, name, strength, dexterity, intelligence, charisma):
        super().__init__(name, strength, dexterity+1, intelligence, charisma)


class Warlock(Character, MixinStealthCheck):
    def __init__(self, name, strength, dexterity, intelligence, charisma):
        super().__init__(name, strength, dexterity, intelligence, charisma+1)


class Beastmaster(Ranger):
    def __init__(self, name, strength, dexterity, intelligence, charisma):
        super().__init__(name, strength, dexterity, intelligence, charisma+1)


class Witcher(Warlock, Ranger):
    def __init__(self, name, strength, dexterity, intelligence, charisma):
        super().__init__(name, strength, dexterity, intelligence, charisma)


heroes = [Beastmaster('Bally', strength=12, dexterity=17, intelligence=12, charisma=8),
          Warlock('Tiresias', strength=7, dexterity=6, intelligence=18, charisma=18),
          Witcher('Barron', strength=10, dexterity=10, intelligence=10, charisma=10)]

print()
for hero in heroes:
    print(hero.__dict__)
    hero.stealth_check()
print(Witcher.mro())

Then as an interesting follow up, the presenter for the following week was Rhys Childs covering Chapter 6: Metaclasses and Attributes, and he decided to play off of the previous week's presentation by Tyler, and utilized the same D&D character classes to explore the property and setter annotations, and take a look at one way in which to do descriptors. We looked at the problem such that we started with the previous week's code, and we need to modify it such that a given stat should have a min value and a max value, first exploring doing so for a single stat using annotations, and then exploring a solution that could be better applied to all stats by using a descriptor. Here's the code that resulted from the exercise:

class Stat:
    def __init__(self):
        self.name = None
        self.internal_name = None

    def __set_name__(self, owner, name):
        self.name = name
        self.internal_name = '_' + name

    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, '')

    def __set__(self, instance, value):
            if 6 <= value <= 20:
                setattr(instance, self.internal_name, value)
            elif value <= 6:
                setattr(instance, self.internal_name, 6)
            else:
                setattr(instance, self.internal_name, 20)


class Character:
    strength = Stat()
    dexterity = Stat()
    intelligence = Stat()
    charisma = Stat()
    def __init__(self, name, strength, dexterity, intelligence, charisma):
        self.name = name
        self.strength = strength
        self.dexterity = dexterity
        self.intelligence = intelligence
        self.charisma = charisma

    # @property
    # def intelligence(self):
    #     return self._intelligence
    #
    # @intelligence.setter
    # def intelligence(self, value):
    #     if 6 <= value <= 20:
    #         self._intelligence = value
    #     elif value <= 6:
    #         self._intelligence = 6
    #     else:
    #         self._intelligence = 20


class Ranger(Character):
    def __init__(self, name, strength, dexterity, intelligence, charisma):
        super().__init__(name, strength, dexterity+1, intelligence, charisma)
class Warlock(Character):
    def __init__(self, name, strength, dexterity, intelligence, charisma):
        super().__init__(name, strength, dexterity, intelligence, charisma+1)
class Beastmaster(Ranger):
    def __init__(self, name, strength, dexterity, intelligence, charisma):
        super().__init__(name, strength, dexterity, intelligence, charisma+1)
class Witcher(Warlock, Ranger):
    def __init__(self, name, strength, dexterity, intelligence, charisma):
        super().__init__(name, strength, dexterity, intelligence, charisma)

heroes = [Beastmaster('Bally', strength=12, dexterity=17, intelligence=12, charisma=8),
          Warlock('Tiresias', strength=7, dexterity=6, intelligence=18, charisma=18),
          Witcher('Barron', strength=10, dexterity=10, intelligence=10, charisma=10)]


print()
for hero in heroes:
    print(f'Before: {hero.__dict__}')
    hero.intelligence += 3
    print(f'After: {hero.__dict__}')

Overall, it seemed that these presentations worked quite well, in that they were able to take concepts that were perhaps a little complicated from the reading, and brought them across in a way that was both entertaining and relatable, which helped to solidify an understanding of those principles for a number of people in the group.

Sunday, February 16, 2020

Presenting by Using Examples from the Book

This week in the developer book club that I facilitate, I saw yet another way to give an engaging presentation, one which is fairly effortless and straightforward to set up. The presenter simply took some example code from the book, and pulled it over into an IDE so that we'd be able to play with it a bit, and then during the presentation he spoke a bit to the parts that he understood, and then asked questions about the parts that he didn't understand, which in turn led to useful and meaningful conversations.

I think that this experience helps illustrate some very important points about what the role of the presenter is. In particular, the presenter is not meant to be a teacher, but rather is meant to lead the conversation. Ideally at this point, everyone has read the chapter, and as such none of the material presented should be new to them (though, of course, in real life, there's always some that either weren't able to read it or only got through part of it). As such, teaching the topic is pointless, because everyone already knows it. What is useful at this point is getting people to think critically on the topic.

I have found that, past the initial studying to gain knowledge, there are three things that promote critical thinking on a topic: questions, experimentation, and conversation. This is why, in the initial article where I describe the presenter role, I prescribed coming up with a practice problem, and having the group tackle the problem via mob programming during the meetup. The practice problem is a question, in particular, it's a question that is easy to experiment with, and the mob programming setup is an environment that allows experimentation as a group, which in turn promotes conversation.

With that being said, if you, as the presenter (or anyone else in the group, for that matter), have legitimate questions about the topic that you don't have answers to, then those questions are frequently more worthwhile than a made up question, especially if there's a good way to experiment with and explore the question, so as to find and better understand the answer.

Thursday, January 23, 2020

Another Approach to Building Exercise Problems

Here recently I've been working with those in my developer book club group to get some of the others to be presenters, and with that we've gone through and covered chapters 2 and 3 of Effective Python with others from the group acting as presenters, and the two presenters took a different approach to coming up with the exercise problems than I did, and it worked pretty well, so I figured that I'd go over their approach a little bit here.

To start off, here's a small sampling of some of the exercise problems and their answers for the Effective Python chapter 2 presentation, as presented by Nathan Malloch:

from typing import List
from collections import defaultdict

print('create a list with 10 elements')
l = [4, 65, 34, 56, 78, 23, 54, 76, 653, 66]
print(l)

print('print the last 3')
print(l[-3:])

print('print all but the last 5')
print(l[:-5])

print('print the first 4')
print(l[:4])

print('make a referenced copy of the list and print them')
g = l
print(g)
print(l)

print('remove the last two indexes from the first list while also updating the second')
l[:] = l[:-2]
print(g)
print(l)

print('make a deep copy of the list, print both')
b = l[:]
print(b)
print(l)

print('remove 2 through 4 index of first list, but don\'t effect the second list')
l[2:5] = []
print(l)
print(b)

So here we see an approach where instead of giving a single, high level problem to solve, we are given problems that are much closer to the code and are very clearly tied to the principles taught in chapter 2.

I find that this approach and the high level approach can both work well in helping to emphasize and teach different things. This low level approach allows you to focus in on a specific language feature and test it in many different ways so as to get an in depth understanding of how that specific feature works, whereas in the high level approach, you aren't told specifically what tools or features to use, and choosing which features to use is part of the problem solving process.

This makes me think of an analogy in math. In your typical math textbook you will run into two types of problems. The first type are your standard problems, which will typically look something like this:

    6
   x3

And then there's the story problem, which looks a bit more like this:

A box of donuts has 6 donuts, and you have 3 boxes. How many donuts do you have?

Now with the first type of problem, solving is pretty straightforward. You're expected to practice your multiplication, like so:

    6
   x3 
   18

The second type of problem also involves setting up the problem, and the best setup would turn out the same:

    6
   x3 
   18

But that being said, it's quite valid for someone to choose to solve the problem in this way, even if it isn't quite ideal:

So with that, a story problem gives the opportunity to exercise your ability to choose the right tool for the job. On the flip side of the coin, though, standard practice problems give the ability to explore important concepts that don't easily fit into an nice little story, such as:

(2+i)(3-4i)=

These same concepts apply to high level and low level exercise problems for programming, where high level also exercises tool selection, and low level allows experimentation with principles that would be hard to fit into a concise story problem.

There is also another benefit to low level exercise problems for the presenter, which is that they are easier to come up with. There's no need to come up with a story, and no need to try and fit unrelated features into the same problem.

One word of caution when using this style, though. Because of how small and simple the problems are, it can be tempting to prepare more material than can be covered in a single meetup. Just as with the story problems, it's important that you don't try to teach everything from the chapter, but instead pick out a few of the more important principles to practice.

And it's important to expect that as you're presenting the problem, that people in the group will want to test and experiment with each of the problems along the way, seeing what happens with one tweak or another. It is important that there is time enough for this experimentation, because that's what's going to help it to stick in people's minds.

So with that, feel free to mix it up between high level and low level exercise problems. They both make for great ways to help solidify the ideas being taught.

Monday, January 13, 2020

Caveat for Optimizing PostgreSQL Migrations

In last week's post I wrote about minimizing ACCESS EXCLUSIVE lock time on a table when writing a migration, and gave the example of taking a migration that looked like this:

alter table people
add column if not exists guid varchar(50) default uuid_generate_v4() not null;

create index if not exists people_guid_index
    on people using btree(guid);

And changing it to this:

alter table people
add column if not exists guid varchar(50);

alter table people
alter column guid set default uuid_generate_v4();

update people set guid = uuid_generate_v4()
where guid is null;

alter table people
add constraint temp_null_check check ( guid is not null ) not valid;

alter table people
validate constraint temp_null_check;

alter table people
alter column guid set not null;

alter table people
drop constraint temp_null_check;

create index concurrently if not exists people_guid_index
    on people using btree(guid);

Because on our example data set the first version caused the people table to have 1 m 6 s 455 ms of ACCESS EXCLUSIVE lock time and 25 s 751 ms of SHARE lock time during the migration, which meant that for over a minute, no select, update, insert, or delete queries could be run against the people table, and then for nearly another 30 seconds following that select queries could run, but update, insert, and delete queries still could not. Alternatively, the second version resulted in 35 ms of ACCESS EXCLUSIVE lock time, 41 s 450 ms of SHARE UPDATE EXCLUSIVE lock time, and 1 m 46 s 636 ms of ROW EXCLUSIVE lock time, which meant that your select, update, insert, and delete queries would only be blocked from running for a total of 35 milliseconds.

Now one important caveat that I'd like to mention here is that this only matters if it's important to still be able to run queries against the table during the migration. This is typically important if you have a zero-downtime deployment pipeline in place. But I have worked at places where the deployment pipeline would intentionally take the servers down before running the migrations, and then bring them back up afterwards. Now, is this an ideal pipeline? No, not really. But many times in the real world we need to work in less than ideal situations. While it would be nice to fix the pipeline in such a situation, it is likely to take a lot of time and effort that may or may not be able to spare at the current time, and it is unrealistic to expect all other work to halt until the pipeline is addressed.

With that being said, which of the two versions of the migration given above would be better in the situation with the pipeline described? The first version. Why? Because at this point we don't care about the lock times. What we care about is how long the server will be down, and the first version's total run time is 1 m 32 s 206 ms, whereas the second version's total run time is 2 m 28 s 121 ms. As such, the second version would cause the downtime to be nearly a minute longer.

It is always important when optimizing to know which metrics are important to optimize to.

Tuesday, January 7, 2020

Minimize Access Exclusive Lock Time When Running a Migration in PostgreSQL

I recently ran into an issue where a data migration written for a PostgreSQL caused some issues because it locked up a table for a rather long period of time, preventing any other queries from being run against that table until the migration was finished. Because the table in question was one that was heavily relied on for core parts of the application, it caused a short but extensive outage.

Now, with the proper foresight, this situation can be avoided, but it requires that a developer be mindful of the type of lock any given query will cause, and for how long. As an example, let's create a similar situation. Let's say that we have the following table called people:

create table people (
  id serial primary key,
  first_name text,
  last_name text
);

And we want to create a guid column, because in the future we want to start moving away from using a sequence for our ids and instead move to using guids. To do this, it would seem to be straightforward to write this migration by building a simple alter table query, like this:

alter table people
add column if not exists guid varchar(50) default uuid_generate_v4() not null;

It seems simple enough. What could go wrong? A lot, as it turns out. And the more data you have in your people table, and the more that data is used, the worse the problems with this query become. To better understand this problem, we need to dive in to the Postgres documentation a bit. In particular we need to understand Table-Level Lock Modes. We can find documentation on this at https://www.postgresql.org/docs/12/explicit-locking.html.

According to the documentation, there are eight different table-level locks, moving from least restrictive to most restrictive in this order: ACCESS SHARE, ROW SHARE, ROW EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE. An alter table query can have one of three different locks: SHARE UPDATE EXCLUSIVE, SHARE ROW EXCLUSIVE, or ACCESS EXCLUSIVE. Now can potentially have one of multiple different locks, it's always best to assume it's the most restrictive lock unless you've explicitly looked at the documentation and know that it is otherwise, and that rule would hold up in the case of the query above, because it does acquire an ACCESS EXCLUSIVE lock.

Now the important thing to understand about locks is what other lock types they conflict with. This in turn tells you what kind of queries will be locked out from accessing the table while the given query is running. In the case of ACCESS EXCLUSIVE, it conflicts with all other locks. Now this in and of itself isn't bad. There are a lot of important things that you wouldn't be able to reliably do for your database without the use of this kind of lock. Where it becomes bad is when a query that needs an ACCESS EXCLUSIVE lock is also a long running query, with the biggest issue caused by this being that it will block all of your standard CRUD operations, which includes all select, insert, update, or delete queries that need access to the table that has been locked.

So how does this apply to the example above? Well, let's find out. In order to test this, we'll want to have a large amount of data in our table. So after creating our table by running this:

create table people (
  id serial primary key,
  first_name text,
  last_name text
);

We can put a little bit of data into the table with the following query:

insert into people (first_name, last_name) values
  ('John', 'Doe'),
  ('Jane', 'Doe'),
  ('Bob', 'Smith'),
  ('Jill', 'Hill'),
  ('Jack', 'Hill');

And then we can turn that into a lot of data by running this query a number of times:

insert into people (first_name, last_name)
  select first_name, last_name
  from people;

Which essentially takes all the data in the table, copies it, and reinserts into the table, effectively doubling the size of the table each time it is run. By running it 20 times, your table will have over 5 million users, which should be enough to give us a clear idea of what's going on. At this point if I run the alter table query from above, it'll take 1 m 6 s 455 ms to complete (Note that your times will likely vary). So that is over a minute where the table is locked with an ACCESS EXCLUSIVE lock, meaning that no other queries can be run against it. This is a problem. Ideally any query that needs an ACCESS EXCLUSIVE lock should be running in the millisecond range, or at worst upwards of a couple of seconds. That's definitely not the case here. So let's figure out how to fix it. Let's start by reverting our table to its pre migration state by running the following query:

alter table people
drop column guid;

Now, the part of the query that is taking so long to run is the part that is trying to generate a new guid for all of the 5 million plus rows that already exist in our database. There isn't really a reason why we need to generate all of those guids as part of the alter table query, and that part of the migration could ultimately be moved out into an update query, which according to the documentation only needs a ROW EXCLUSIVE lock, which is much more permissive and will allow other select, insert, update, and delete queries to run while it is runnning.

So let's start out with modifying our alter table query to remove the default value for the time being. Note that by removing the default value we are also required to remove the not null constraint.

alter table people
add column if not exists guid varchar(50);

This pared down query runs in 14 ms and will give you a resulting table that looks something like this:

id	first_name	last_name	guid
1	'John'	'Doe'	NULL
2	'Jane'	'Doe'	NULL
3	'Bob'	'Smith'	NULL
4	'Jill'	'Hill'	NULL
5	'Jack'	'Hill'	NULL
...	...	...	...

At this point, we can add on our default value.

alter table people
alter column guid set default uuid_generate_v4();

This query runs in 4 ms, which is good, because this query also requires an ACCESS EXCLUSIVE lock. At this point our total ACCESS EXCLUSIVE lock time is 18 ms. The reason why this query is so quick is because it doesn't set the guid value for the 5 million plus existing rows, but instead ensures that newly inserted rows will have a guid. So for instance if at this point I ran the query:

insert into people (first_name, last_name) values
  ('Susie', 'Queue');

I'd end up with a table that looks something like this:

id	first_name	last_name	guid
1	'John'	'Doe'	NULL
2	'Jane'	'Doe'	NULL
3	'Bob'	'Smith'	NULL
4	'Jill'	'Hill'	NULL
5	'Jack'	'Hill'	NULL
...	...	...	...
5242881	'Susie'	'Queue'	'60f029c8-3818-43ba-bb0a-9a483a4c5529'

Now we can run an update query to add guids to all existing rows:

update people set guid = uuid_generate_v4()
where guid is null;

Note the conditional 'where guid is null'. This prevents us from overriding the guid values of any newly inserted rows, such as the Susie Queue row in the example above. Running this query takes 1 m 46 s 636 ms. Note that this takes longer than our original alter table query, but that's ok, because this is 1 m 46 s 636 ms of Row EXCLUSIVE lock time, which is a much more permissive lock that will still allow for select, insert, update, and delete queries to run. So at this point we have 18 ms of ACCESS EXCLUSIVE lock time, and 1 m 46 s 636 ms of ROW EXCLUSIVE lock time, with a table that looks like this:

id	first_name	last_name	guid
1	'John'	'Doe'	'cf8cd62a-5849-491a-a0b6-fd96dddb9562'
2	'Jane'	'Doe'	'ea9722ae-f4af-4d52-bea9-fd38691d37a7'
3	'Bob'	'Smith'	'a75d722c-217d-4a02-bb94-f4246fcf1a5c'
4	'Jill'	'Hill'	'ab09e9ca-75b9-4bf1-a3cf-b34cc8fe6965'
5	'Jack'	'Hill'	'14c31156-c57e-4250-93d1-110ba78d2b5d'
...	...	...	...
5242881	'Susie'	'Queue'	'60f029c8-3818-43ba-bb0a-9a483a4c5529'

But we're still not finished yet. We still need to add on our not null constraint. We can do this with the following query:

alter table people
alter column guid set not null;

This query took 6 s 290 ms to run, and requires an ACCESS EXCLUSIVE lock. This would bring our total ACCESS EXCLUSIVE lock time up to 6 s 308 ms. In many situations this is probably acceptable, and you could stop here. But with that being said, 6 seconds is still a long time to lock down all access to a table, especially if that table is critical to your application. So can we improve this and get the time down further? We can, but it requires diving deeper into the alter table documentation, which can be found at https://www.postgresql.org/docs/12/sql-altertable.html.

In the documentation for set not null, it says: "SET NOT NULL may only be applied to a column providing none of the records in the table contain a NULL value for the column. Ordinarily this is checked during the ALTER TABLE by scanning the entire table". This gives us the explanation of why the query takes so long. After it locks the table, it has to go through every single row and verify that there are no nulls before it can apply the constraint. Why does it need to do this? Simple, it's because while at this point you and I know that there are no nulls in any of the rows, Postgres has no way of knowing this. This is because at this current point it would still be completely valid to run a query such as:

insert into people (first_name, last_name, guid) values
  ('Joe', 'Johnson', null);

Or like:

update people set guid = null
where id = 1

While you and I know that no such query has been run, Postgres has no way of knowing that for certain, hence the need to check every single row to verify that there are no nulls. But in the very next line of the documentation it gives us a way in which this very expensive scan can be skipped. It says: "however, if a valid CHECK constraint is found which proves no NULL can exist, then the table scan is skipped." So at this point we'll want to take a closer look at the documentation for the check constraint.

As we look over the documentation for the check constraint, we know that it will also require an ACCESS EXCLUSIVE lock, because there is no explicit documentation stating otherwise, and at the beginning of the alter table documentation it states that "[a]n ACCESS EXCLUSIVE lock is held unless explicitly noted." So with that in mind, let's take a closer look at what the documentation has to say about how the check constraint works. It states: "Normally, this form will cause a scan of the table to verify that all existing rows in the table satisfy the new constraint. But if the NOT VALID option is used, this potentially-lengthy scan is skipped." Furthermore, it states that: "The constraint will still be enforced against subsequent inserts or updates (that is [...] they'll fail unless the new row matches the specified check condition). But the database will not assume that the constraint holds for all rows in the table, until it is validated by using the VALIDATE CONSTRAINT option."

From this we can assume that adding a check constraint will result in the same lengthy ACCESS EXCLUSIVE lock time as the not null constraint from before, unless we use the NOT VALID option. By doing so, the constraint will only be enforced for new changes to the data, and not any of the current data. This constraint can then be validated for the rest of the existing data later via the use of the VALIDATE CONSTRAINT option. So with this in mind, let's go take a closer look at the VALIDATE CONSTRAINT documentation.

The VALIDATE CONSTRAINT section points us to the notes section, and the notes section tells us the following: "[A] VALIDATE CONSTRAINT command can be issued to verify that existing rows satisfy the constraint. The validation step does not need to lock out concurrent updates, since it knows that other transactions will be enforcing the constraint for rows that they insert or update; only pre-existing rows need to be checked. Hence, validation acquires only a SHARE UPDATE EXCLUSIVE lock on the table being altered."

Going back over to the documentation on locks, we can that the SHARE UPDATE EXCLUSIVE lock only conflicts with locks at the SHARE UPDATE EXCLUSIVE level or above. This means that it won't prevent our select, update, insert, or delete queries. As such, we should be able to add a check constraint with the NOT VALID option, then validate that constraint, followed by adding the not null constraint, and then finally dropping the check constraint. Let's do that.

alter table people
add constraint temp_null_check check ( guid is not null ) not valid;

This query took 9 ms to run, and required an ACCESS EXCLUSIVE lock.

alter table people
validate constraint temp_null_check;

This query took 6 s 748 ms to run, and required a SHARE UPDATE EXCLUSIVE lock.

alter table people
alter column guid set not null;

This query took 4 ms to run, and required an ACCESS EXCLUSIVE lock.

alter table people
drop constraint temp_null_check;

And this final query took 4 ms to run, and required an ACCESS EXCLUSIVE lock.

At this point our entire migration from start to finish requires 35 ms of ACCESS EXCLUSIVE lock time, 6 s 748 ms of SHARE UPDATE EXCLUSIVE lock time, and 1 m 46 s 636 ms of ROW EXCLUSIVE lock time, resulting in a total migration time of 1 m 53 s 419 ms, of which time, only 35 ms caused CRUD queries to be blocked.

As an additional note, as part of a migration such as this, it's very likely that you'd also want to create an index on the newly created guid column, with the original migration looking something like this:

alter table people
add column if not exists guid varchar(50) default uuid_generate_v4() not null;

create index if not exists people_guid_index
    on people using btree(guid);

This create index call is also problematic, in that it takes 25 s 751 ms to run on our example problem, and it requires a SHARE lock. A SHARE lock conflicts with a ROW EXCLUSIVE lock, so while this would allow select queries to be made, it would block insert, update, and delete queries. Luckily, this is easily fixed. Simply add the keyword concurrently, like so:

create index concurrently if not exists people_guid_index
    on people using btree(guid);

When the keyword concurrently is added, it only requires a SHARE UPDATE EXCLUSIVE lock, which will in turn allow for insert, update, and delete queries to run. Note that adding concurrently will make the query take longer to run, with this taking 34 s 702 ms to run on our example, but again, this is preferred, because it doesn't block anything. So with that, that would bring our final migration to look something like this:

alter table people
add column if not exists guid varchar(50);

alter table people
alter column guid set default uuid_generate_v4();

update people set guid = uuid_generate_v4()
where guid is null;

alter table people
add constraint temp_null_check check ( guid is not null ) not valid;

alter table people
validate constraint temp_null_check;

alter table people
alter column guid set not null;

alter table people
drop constraint temp_null_check;

create index concurrently if not exists people_guid_index
    on people using btree(guid);

With it requiring 35 ms of ACCESS EXCLUSIVE lock time, 41 s 450 ms of SHARE UPDATE EXCLUSIVE lock time, and 1 m 46 s 636 ms of ROW EXCLUSIVE lock time, with a total time of 2 m 28 s 121 ms, with only 35 ms of that blocking CRUD operations. And that is how you fix a migration to minimize lock times and keep everything running smoothly. The important lesson here is to always be aware of your lock times when writing migrations around heavily used and sensitive tables.