This is a neural network designed to play Texas Hold'em poker. I built it as a Keras model in the Sequential API. It is a very basic model that simply takes the last few round states and returns a ternary output: fold, call, or raise. I took on this project because Texas Hold'em is, in essence, the same as Rock, Paper, Scissors: a psychological game with a three-choice output. Of course, Texas Hold'em involves cards and other players' bets, but those can be simplified to single values, which is how my network functions.

I represent each player by their stack, the number of chips they have remaining. The change in this value over the course of a few rounds of betting reveals their exact raise amounts without needing to record their actual moves. On top of that, the cards in the network's hand and on the table can be reduced to two values. The first value is the network's current hand strength, calculated with the built-in functions provided by PyPokerEngine. The second represents the strongest hands that could be made from the cards on the table; if three cards of the same suit appear, for example, the network will know that a flush could be completed with what is showing. While this is a crude and imperfect simplification of the game, it is "good enough" to allow the network to learn and improve.
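As a rough sketch, a single round-state vector might be assembled like this. `gen_cards` and `estimate_hole_card_win_rate` are real PyPokerEngine utilities, but whether the project uses them exactly this way is my assumption, and `score_board` is a hypothetical placeholder for the board-potential calculation:

```python
from pypokerengine.utils.card_utils import gen_cards, estimate_hole_card_win_rate

def encode_round(stacks, hole_card, community_card):
    # stacks: chip counts for all six players, e.g. [1000, 950, ...]
    # hole_card / community_card: card strings like "H4" or "SK"
    hand_strength = estimate_hole_card_win_rate(
        nb_simulation=100,
        nb_player=6,
        hole_card=gen_cards(hole_card),
        community_card=gen_cards(community_card),
    )
    board_potential = score_board(community_card)  # hypothetical helper
    return stacks + [hand_strength, board_potential]  # 8 values per round
```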
PyPokerEngine is a library on GitHub that provides a Texas Hold'em environment in Python and is relatively easy to use. It also has the added benefit of a built-in GUI, so I can play against my network once training is complete.
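For example, a six-player game can be set up in a few lines. This is a minimal sketch using PyPokerEngine's game API, where `MyPokerBot` is a hypothetical `BasePokerPlayer` subclass wrapping the network:

```python
from pypokerengine.api.game import setup_config, start_poker

# MyPokerBot is a hypothetical BasePokerPlayer subclass wrapping the network.
config = setup_config(max_round=10, initial_stack=1000, small_blind_amount=10)
for i in range(6):
    config.register_player(name="player%d" % i, algorithm=MyPokerBot())
game_result = start_poker(config, verbose=1)
```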
My network is designed to play six-player Texas Hold'em. It takes an array of five lists, where each list represents the round state of one round of betting, and five rounds of betting make one full hand. Each list has length eight: the first six values are the stacks of the six players, and the last two are the network's hand rank and the strongest hand that can be made with the cards on the table. This gives a total of 40 inputs, and the network outputs a single value through a Leaky ReLU activation. The leaky rectified linear unit was chosen because its output maps easily onto poker moves: any value below 0 is a fold, any value between 0 and the minimum bet is a call, and any value above the minimum bet is a raise of exactly that amount.
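A minimal sketch of that architecture and the output-to-move mapping is below. The hidden-layer sizes are illustrative assumptions; only the 40-value input and the single Leaky ReLU output come from the design described above:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(40,)),             # 5 rounds of betting x 8 values each
    layers.Dense(64, activation="relu"),  # hidden sizes are assumptions
    layers.Dense(32, activation="relu"),
    layers.Dense(1),
    layers.LeakyReLU(),                   # single output on a Leaky ReLU curve
])

def to_move(output, min_bet):
    # Map the scalar output onto fold / call / raise as described above.
    if output < 0:
        return "fold", 0
    if output <= min_bet:
        return "call", min_bet
    return "raise", output
```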
The reason Leaky ReLU was chosen over ReLU is due in part to limitations of the Keras API I am using, but mostly to my custom loss function: the absolute value of my network's move minus the amount won. Since my network attempts to minimize this loss, I realized that standard ReLU might cause the network to always output 0, folding every time. That lack of play is not what I want to encourage, so I switched to Leaky ReLU, which biases the network toward calls over folds: the absolute value of a negative number is positive, so a strong fold is penalized more than a passive call. While this is not the most profitable poker strategy, I decided I would rather have a network that plays poker somewhat poorly than one that does not play at all.
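In Keras terms, this loss could be written as a custom function along these lines; it is a sketch that assumes the amount won on the hand is supplied as `y_true`:

```python
import tensorflow as tf

def poker_loss(y_true, y_pred):
    # y_pred: the network's move; y_true: the chips won on the hand.
    # Penalizes any move that strays far from the realized winnings,
    # in either direction (hence the absolute value).
    return tf.reduce_mean(tf.abs(y_pred - y_true))

# model.compile(optimizer="adam", loss=poker_loss)  # usage sketch
```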
The Keras Sequential API does not allow me to pass a sum of past predictions into my custom loss function, so I will have to switch to the Functional API if I wish to continue this project.
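One possible Functional-API shape for that idea (my assumption, not the project's actual code) is to feed the running sum of past predictions in as an extra input and attach the loss with `add_loss()`, which tf.keras's Functional API supports:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

states = keras.Input(shape=(40,), name="round_states")
past_sum = keras.Input(shape=(1,), name="past_prediction_sum")  # running sum
winnings = keras.Input(shape=(1,), name="amount_won")

x = layers.Dense(64, activation="relu")(states)
move = layers.LeakyReLU()(layers.Dense(1)(x))

model = keras.Model(inputs=[states, past_sum, winnings], outputs=move)
# Fold the extra tensors directly into the loss, which Sequential cannot do.
model.add_loss(tf.reduce_mean(tf.abs(move + past_sum - winnings)))
model.compile(optimizer="adam")
```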