Riley Holterhus | Optimizor

IAmTheOptimizor Challenge

2022-10-13 edit: This solution is no longer the best solution - the current best was created by Matt Gleason and Noah Citron of a16z

On the afternoon of September 10th, 2022, 0xBeans announced the launch of his "king-of-the-hill optimizor challenge". The idea of the challenge is to implement a gas optimized solution to a variant of the classic 3-sum problem. The challenge has attracted many highly-skilled competitors, and the high-score has been beaten 15 times so far. The current best solution was created by me, and was submitted on-chain with the help of @rage_pit and @samczsun. The idea of this solution is described below.

Implementation Details

If you take a look at the code for the challenge you will notice the following:

To submit a solution, competitors call becomeTheOptimizor(address player), where player is the address of their implementation contract.

The contract uses a seed value to create a random instance of the 3-sum problem. The size variable here refers to the bytecode size of the solution we are submitting.

To help prevent front-running of solutions, the player contract must return msg.sender when owner() is called on it.

To check the correctness of a submitted solution, the challenge contract will call solve(inputArr, desiredSum) on the player contract. This call should return an array of 3 distinct indices, where the sum of the elements in inputArr at these indices equals desiredSum.

The inputArr will always have 10 elements, all between 0 and 49. This means the correct indices to return will always be three numbers between 0 and 9.

The above details imply that at the very least, the player bytecode must return msg.sender (your address) when called with owner(), and return three distinct numbers between 0 and 9 when called with solve(inputArr, desiredSum). Many competitors realized that it would be very efficient to just hardcode these return values and wait for a block that aligns with the hardcoded values. Even though the meta had progressed past the original 3-sum problem, optimizing the hardcoding was still a very interesting problem and many people were still hooked.

An example solution

To see how elaborate the solutions are, consider this bytecode, which was an optimization of a solution by @wireless_anon:

This bytecode stores 8 in the second word of memory, and always leaves the third word uninitialized (meaning the third word will be 0, the default value for uninitialized memory). Lastly, this bytecode stores (tx.origin << calldata[0x0f:0x0f+0x20]) | 0x01 in the first word of memory. The idea is:

When owner() is called on this bytecode, the calldata will be <owner_4byte_sig>0000000..., where the zeros extend infinitely (since zero is the default value for unset calldata). This means tx.origin will be shifted left by 0 bits (a no-op), and tx.origin | 0x01 will be stored in the first word of the memory. As long as the tx.origin address has its least significant bit set to 1, the OR operation will be a no-op and this will pass the challenge's owner() check.

When solve(inputArr, desiredSum) is called on this bytecode, the calldata will be <solve_4byte_sig><inputArr[0]><inputArr[1]>..., and since calldata[0x0f:0x0f+0x20] is not word-aligned, calldata[0x0f:0x0f+0x20] will be a relatively large value. This means that tx.origin will be shifted left by more than 256 bits (zeroing out the value completely), and thus 0x01 will be stored in the first word of memory and [1, 0, 8] is returned. A competitor can wait until an upcoming block has a seed that leads to this being a correct answer (this should happen with probability 1/120), and then submit in that block to solve the challenge.

My Solution

Up until I submitted my most recent solution, the #1 spot had been ripped away from me 4 different times. I was deep into what many would consider a "nerd-snipe", and I couldn't stop thinking about the problem. After a few days, I eventually came up with the following bytecode:

This bytecode will first store tx.origin in the area specified by calldata[0x04:0x04+0x20] (aka inputArr[0]). Then, it will store 0x01 in the third word of the memory (i.e. in [0x40,0x60)). If we think about the two cases again:

When owner() is called on this bytecode, calldata[0x04:0x04+0x20] will be zero, so tx.origin will be stored in the first word of the memory. This passes the owner() check.

Now, when solve(inputArr, desiredSum) is called, suppose that inputArr[0] = 49 = 0x31. When we store tx.origin in this location, we be will storing the first 0x40 - 0x31 = 15 bytes of tx.origin in the second word of the memory, and the rest will overflow into the third word. Since the first 12 bytes of addresses are always zero (addresses are 20 byte values), this means that only the first 3 bytes of the tx.origin value will be in the second word. We can easily grind out an address where the first 3 bytes equals a number between 2 and 9, making the second word in memory a valid distinct index. Since the MSTORE for 0x01 happens after all of this, the overflowed part of tx.origin will be clobbered by the 0x01 value. This means that the returned indices will be [0, tx.origin[0x00:0x06], 1], which is potentially a correct answer.

To get this solution on-chain, we can do some extra work up-front to increase our chances of the correct randomness happening. Instead of just grinding out one tx.origin address, we can grind out a corresponding address for each number between 2 and 9. Also, our solution would work just as well if inputArr[0] was 48 or 47 and our tx.origin address had one or two more leading zero bytes (inputArr[0] < 47 could also work, but the corresponding addresses had too many leading zeros for us to grind out relatively quickly). After we computed the private keys for these addresses, we had a script watch for upcoming block proposers registered with mev-boost. These blocks will have predictable block.timestamp, block.number and block.coinbase values (the coinbase would likely be the flashbots builder address). This means that our script could anticipate which upcoming blocks would be valid, and which tx.origin would be needed to initiate the becomeTheOptimizor(...) transaction. After our script recognized block number 15716477 had a seed that lead to inputArr[0] = 48 with expected indices [0, 5, 1], it built a flashbots bundle using address 0x00000005b9FDaCC947710E88B42522a5Ab5a7B00. The bundle successfully landed, giving us the new high-score.

⇦ Writing

October 12, 2022