This article is for demonstration purposes only.
TREES
A tree is a widely used abstract data type that simulates a hierarchical structure: a root value and subtrees of children, each with a parent node, represented as a set of linked nodes or as an array.
This article is inspired by Google's AlphaGo and IBM's Deep Blue and focuses on the applicability of trees in games and similar AI fields. An attempt has been made to provide general optimization techniques which can be used to reduce the complexity of the program.
We shall focus on TicTacToe as an example. As there are only 255,168 possible games, it is not that hard to use a simulating AI system.
A game tree is a type of recursive search function that examines all possible moves of a strategy game, and their results, in an attempt to ascertain the optimal move. Game trees are very useful for AI in scenarios that require static decision making and have a relatively low number of possible choices per play. The most commonly cited examples are chess, Go, and TicTacToe.
Limitations of game trees:
1. Game trees are rarely used in real-time scenarios.
2. Games with abstract rules and uncertainty do not mix well with game trees.
General Concept:
Game trees are generally used in board games to determine the best possible move. For this purpose we are using TicTacToe as an example. The idea is to start at the current board position and check all the possible moves the computer can make. Then, from each of those possible moves, the computer looks at what moves the opponent may make, then back at its own, and so on. Ideally, the computer will flip back and forth, making moves for itself and its opponent, until the game's completion. Recursion is very effective in this scenario.
This article also goes through the Min-Max algorithm to devise a method that heavily reduces computation time.
The computer begins evaluating a move by generating all possible moves. In TicTacToe, a move fills any empty square on the grid, so the number of possible moves equals the number of empty squares. Once it has these moves, it loops over each one and tries to find out whether that move will result in a win for the computer. It does this by running the same algorithm (recursion) over the position obtained by making the move, but this time calculating the opponent's best reply.
Scoring the state as an absolute value (MIN-MAX algo)
To find whether a move is "good" or "bad", the game tree is extended and continues to branch out until the game ends (a "terminal node"). Terminal nodes are then assigned values based on the result of the game; the higher the value, the better the result is for the computer. Because the decisive results are a win or a loss, the values "-1000" and "1000" can be used to represent who has won (a draw scores 0). Once the terminal nodes' values have been determined, values are propagated to the nodes above them: if a node represents the computer's choice of moves it is a "max" node; if it represents the player's choice of moves it is a "min" node.
The value of a max node becomes that of the highest node beneath it. The value of a min node becomes that of the lowest node beneath it. By continuing to move up, each node is given a value. Hence, the top node assumes the highest value of the nodes beneath it, which represents the ideal choice for the computer. The traversal is similar to Depth First Search.
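The propagation rule can be traced on a tiny hand-built tree. The shape and leaf values below are invented purely for illustration (a 2-ply tree: three computer moves, three opponent replies each), not taken from any real TicTacToe position:

```c
#include <assert.h>

/* Leaf scores for a made-up 2-ply tree: the computer (max) has three
 * candidate moves; for each, the opponent (min) has three replies. */
static int leaves[3][3] = {
    { -1000,  1000,    0 }, /* replies to computer's move 0 */
    {  1000,  1000,    0 }, /* replies to computer's move 1 */
    {  1000, -1000, 1000 }, /* replies to computer's move 2 */
};

/* Each min node takes the lowest value beneath it; the max node on
 * top then takes the highest of those minima. */
int minimax_root(void)
{
    int best = -1001; /* below any possible score */
    for (int move = 0; move < 3; move++) {
        int worst = 1001; /* above any possible score */
        for (int reply = 0; reply < 3; reply++)
            if (leaves[move][reply] < worst)
                worst = leaves[move][reply];
        if (worst > best)
            best = worst;
    }
    return best;
}
```

Here move 1 is best: against perfect play it still draws (value 0), while moves 0 and 2 can both be forced into a loss (-1000).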
Optimization
Searching to a certain look ahead
Though searching every move is possible, it is faster to think only about 3-6 moves ahead. Six moves are often enough to determine how good or bad a move may be, without sacrificing too much speed.
Humans play a game similarly. We do not calculate through the entire game for each move; we always think through the consequences of a given move 3-6 moves in the future, and decide based on them.
To put this into perspective, if the average chess position has 20 moves (an underestimate), and you want to search a game to completion, assuming an average of 40 moves per game (another underestimate), you'll have to search 20^40 positions (about 10^52). Even modern computers with incredibly powerful hardware are incapable of searching anywhere near this scale. But computers are easily capable of searching 4 or 5 moves deep and evaluating the position to see who is winning, and this is what they do. Instead of searching 40+ moves deep, they search 2-8 moves deep and evaluate the end positions.
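The cutoff itself is a one-line change to the recursion: stop at depth 0 and return a heuristic guess. To keep the sketch self-contained, it demonstrates the pattern on a deliberately tiny made-up game (a pile of stones, take 1 or 2 per turn, whoever takes the last stone wins) rather than the article's TicTacToe:

```c
#include <assert.h>

/* Depth-limited negamax on a toy pile game. The game itself is only
 * a stand-in; the point is the depth cutoff, after which we return a
 * heuristic estimate (here simply 0, meaning "unclear"). */
int negamax(int pile, int depth)
{
    if (pile == 0)
        return -1; /* the previous player took the last stone: we lost */
    if (depth == 0)
        return 0;  /* search horizon reached: fall back to a heuristic */
    int best = -1;
    for (int take = 1; take <= 2 && take <= pile; take++) {
        /* Score the position from the opponent's view and negate it. */
        int movescore = -negamax(pile - take, depth - 1);
        if (movescore > best)
            best = movescore;
    }
    return best;
}
```

With enough depth the search is exact: piles that are multiples of 3 are lost. With depth 1, a 6-stone pile comes back as 0 ("unclear") even though full search shows it is lost, which is exactly the kind of blind spot discussed under "Problems With these Optimizations".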
Pruning
In the whole game tree, there are many nodes that represent the same state and differ only by symmetry. If all of these nodes are merged into a single node that defines the current state of the game, the space complexity can be heavily reduced.
Using this technique, the original tree size of 9! = 362,880 can be reduced to under 19,683, and further to under 800 if done efficiently.
For evaluation, we can devise a method that analyzes the game state from the current board positions.
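One way to merge symmetric duplicates is to map every board to a canonical key: generate all eight rotations and reflections of the 3x3 grid and keep the lexicographically smallest one. The permutation tables below are one possible encoding of those symmetries, not something from the article:

```c
#include <string.h>

/* The 8 symmetries of a 3x3 board, each as a permutation of the 9
 * cells (cell i of the transformed board comes from cell sym[k][i]
 * of the original, cells numbered row by row). */
static const int sym[8][9] = {
    {0,1,2,3,4,5,6,7,8}, /* identity */
    {6,3,0,7,4,1,8,5,2}, /* rotate 90 */
    {8,7,6,5,4,3,2,1,0}, /* rotate 180 */
    {2,5,8,1,4,7,0,3,6}, /* rotate 270 */
    {2,1,0,5,4,3,8,7,6}, /* mirror left-right */
    {6,7,8,3,4,5,0,1,2}, /* mirror top-bottom */
    {0,3,6,1,4,7,2,5,8}, /* main-diagonal flip */
    {8,5,2,7,4,1,6,3,0}, /* anti-diagonal flip */
};

/* Write the lexicographically smallest symmetric variant of the
 * 9-character board string into out (10 bytes incl. NUL). Boards
 * that are rotations/mirrors of each other get the same key, so
 * they collapse into one node of the tree. */
void canonical(const char board[9], char out[10])
{
    char cand[10];
    cand[9] = out[9] = '\0';
    memcpy(out, board, 9);
    for (int k = 1; k < 8; k++) {
        for (int i = 0; i < 9; i++)
            cand[i] = board[sym[k][i]];
        if (strcmp(cand, out) < 0)
            memcpy(out, cand, 9);
    }
}
```

For example, an 'x' in any of the four corners of an otherwise empty board yields the same canonical key, so all four openings are stored once.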
Problems With these Optimizations
The horizon effect is when the computer makes a poor decision because the consequence of a move lies just beyond its search depth (if it is searching 5 ply and the bad consequence of a move is 6 ply deep, the computer will not see it). This can be problematic when the computer takes a piece in chess without seeing a recapture, or when the computer is threatened and keeps trying to delay the inevitable, often making its position worse.
Pruning in itself is a very complex operation. Computers will find it difficult to compare positions in more advanced games like chess and Go: advanced board positions will not have symmetrical duplicates, and therefore cannot be collapsed into a single state. Hence, it is easy to prune nodes at the start of the game, but as the game goes on, searching through every move in a pruned decision tree becomes challenging.
Computers have serious problems playing any game with a large number of moves per turn (called the "branching factor" of a game tree). Chess has a branching factor of about 30 (an average of 30 moves per turn). Other games (Go, for instance) have a very large number of possible moves, and, as a result, computers are unable to search every move to enough depth to make a reasonable move.
For a long time, novices regularly beat computers at Go, because computers are unable to cope with the large number of moves: the computer has to search a number of positions equal to the following expression:
Number of positions = (branching_factor)^depth_searched.
In chess, it is about 30^depth; in Go, it is astronomically larger!
Implementation
Imagine that at any point in a tic-tac-toe board, every single possible move is a branch. The current state of the board is the root. One move is a branch. Now pretend (one at a time), that each branch becomes the current state. Each possible move becomes a new branch. The leaf of the tree is when the last move is made and the board is full.
This structure is useful when you have to store the whole lookup table of best possible moves in memory.
But by using a recursive method, we can do the same at run-time.
int score(char board[3][3], char player1, char player2)
The call stack would, over time, look like a tree if you drew it on paper. You should return the move that leads to a node at which the opponent definitely (or most likely) loses.
Now we come to scoring.
How do you decide which move is optimal? The move most likely to give us a win is scored with a max value, and a losing one with a min value.
Here is the code snippet for finding the best possible move:
int score(char board[3][3], char p1, char p2)
{
    int bestmove = -9999;
    int movescore = 0;

    /* Base cases: somebody has already won this position. */
    char winner = check(board);
    if (winner == p1)
        return 1000;
    else if (winner == p2)
        return -1000;

    /* Try every empty square as p1's move, score the resulting
     * position from p2's point of view, and negate it. */
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++)
            if (board[r][c] == '.')
            {
                board[r][c] = p1;
                movescore = -score(board, p2, p1);
                board[r][c] = '.'; /* undo the move */
                if (movescore >= bestmove)
                    bestmove = movescore;
            }

    /* No empty squares left, or the best we can force is a draw. */
    if (bestmove == -9999 || bestmove == 0)
        return 0;
    if (bestmove < 0)
        return ++bestmove; /* values at deeper levels are less favourable */
    return --bestmove;
}
int pickmove(char board[3][3], char p1, char p2)
{
    int bestmove = -9999;
    int bestrow = -9999;
    int bestcol = -9999;
    int movescore = 0;

    /* Same search as score, but remember WHERE the best move is,
     * not just how good it is. */
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++)
            if (board[r][c] == '.')
            {
                board[r][c] = p1;
                movescore = -score(board, p2, p1);
                board[r][c] = '.'; /* undo the move */
                if (movescore >= bestmove)
                {
                    bestmove = movescore;
                    bestrow = r;
                    bestcol = c;
                }
            }

    /* Pack the coordinates: row in the tens place, column in the
     * ones place. */
    return 10 * bestrow + bestcol;
}
Where check(char board[3][3]) returns 'x' or 'o' for the player that has won in the current board position and '.' represents an empty position on the board.
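check is not shown in the article; a straightforward version scans the three rows, three columns, and two diagonals. The sketch below is one plausible implementation matching the conventions above, not the author's original:

```c
/* Return 'x' or 'o' if that player has three in a row, else '.'.
 * Follows the convention used by score and pickmove above. */
char check(char board[3][3])
{
    for (int i = 0; i < 3; i++) {
        /* rows */
        if (board[i][0] != '.' && board[i][0] == board[i][1]
                               && board[i][1] == board[i][2])
            return board[i][0];
        /* columns */
        if (board[0][i] != '.' && board[0][i] == board[1][i]
                               && board[1][i] == board[2][i])
            return board[0][i];
    }
    /* diagonals */
    if (board[1][1] != '.' && board[0][0] == board[1][1]
                           && board[1][1] == board[2][2])
        return board[1][1];
    if (board[1][1] != '.' && board[0][2] == board[1][1]
                           && board[1][1] == board[2][0])
        return board[1][1];
    return '.'; /* no winner (yet) */
}
```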
The value returned by pickmove packs the row into the tens place and the column into the ones place.
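A caller recovers the square with integer division and remainder (a small hypothetical helper, assuming a legal move was found):

```c
/* Split pickmove's packed return value back into coordinates. */
void decode_move(int packed, int *row, int *col)
{
    *row = packed / 10; /* tens place */
    *col = packed % 10; /* ones place */
}
```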
Each time score is recursively called, the two players are interchanged so that the other player's best move is determined by the same method. The recursive result is negated each time, because a score that is good for one player is equally bad for the other.
Using the functions given above, we can make the entire TicTacToe program.
It is highly recommended to write these functions yourself; the ones above are provided only for reference.