
SciBet provides bettors with links to, and abstracts of, scientific publications on sports prediction. The publications are available online and are divided into five categories that cover most of the prediction methods on the market: Overview of Regression Models, Poisson Regression, Skellam Regression, Recursive Bayesian Estimation, and Kernel Regression.
| Overview of Regression Models |
|---|
1. Kenneth Massey (1997) - Statistical Models Applied to the Rating of Sports Teams
Abstract: One of the most intriguing aspects of sports is that it thrives on controversy. Fans, the media, and even players continually argue the issue of which team is best, a question that can ultimately be resolved only by playing the game. Or can it? It is likely that a significant portion of sporting results could be regarded as flukes. Simply put, the superior team does not always win. Therefore even a playoff, although it may determine a champion, will not necessarily end all disagreement as to which team is actually the best.
2. Emonet Benoit (2000) - Revisiting Statistical Applications in Soccer
Abstract: The present report results from a project taking part of the "Science, Technique and Society" cursus. These projects have to be done during the undergraduate studies in the Department of Mathematics at the Swiss Federal Institute of Technology (EPFL). Our main goal was to review the statistical work related to soccer throughout articles published in statistical journals; see the references for an extensive list. To do so we merged them together in order to give an overall view of the different investigations.
| Poisson Regression |
|---|
1. Mark J. Dixon, Stuart G. Coles (1996) - Modelling Association Football Scores
Abstract: A parametric model is developed and fitted to English league and cup football data from 1992 to 1995. The model is motivated by an aim to exploit potential inefficiencies in the association football betting market, and this is examined using bookmakers' odds from 1995 to 1996. The technique is based on a Poisson regression model but is complicated by the data structure and the dynamic nature of teams' performances. Maximum likelihood estimates are shown to be computationally obtainable, and the model is shown to have a positive return when used as the basis of a betting strategy.
2. Dimitris Karlis, Ioannis Ntzoufras (2003) - Analysis of Sports Data Using Bivariate Poisson Models
Abstract: Models based on the bivariate Poisson distribution are used for modelling sports data. Independent Poisson distributions are usually adopted to model the number of goals of two competing teams. We replace the independence assumption by considering a bivariate Poisson model and its extensions. The models proposed allow for correlation between the two scores, which is a plausible assumption in sports with two opposing teams competing against each other. The effect of introducing even slight correlation is discussed. Using just a bivariate Poisson distribution can improve model fit and prediction of the number of draws in football games. The model is extended by considering an inflation factor for diagonal terms in the bivariate joint distribution. This inflation improves in precision the estimation of draws and, at the same time, allows for overdispersed, relative to the simple Poisson distribution, marginal distributions. The properties of the models proposed as well as interpretation and estimation procedures are provided. An illustration of the models is presented by using data sets from football and water-polo.
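As an illustration of the independent Poisson baseline that the Karlis and Ntzoufras abstract refers to, the following minimal sketch turns two assumed scoring rates into home win, draw, and away win probabilities. The function names, the example rates, and the truncation at `max_goals` are illustrative assumptions and do not come from any of the papers listed here; the bivariate and diagonal-inflated extensions are not implemented.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probabilities(lam_home, lam_away, max_goals=15):
    """Home win, draw, and away win probabilities under independent Poisson scores.

    The score grid is truncated at max_goals per team (an assumption of this
    sketch); for typical football scoring rates the discarded mass is negligible.
    """
    home = draw = away = 0.0
    for i in range(max_goals + 1):
        p_home_scores_i = poisson_pmf(i, lam_home)
        for j in range(max_goals + 1):
            p = p_home_scores_i * poisson_pmf(j, lam_away)
            if i > j:
                home += p
            elif i == j:
                draw += p
            else:
                away += p
    return home, draw, away


if __name__ == "__main__":
    # Hypothetical scoring rates, chosen only for illustration.
    print(outcome_probabilities(1.5, 1.1))
```

Because the two scores are treated as independent here, the sketch cannot capture the correlation between scores that the bivariate Poisson model is designed to address.
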
| Skellam Regression |
|---|
1. Dimitris Karlis, Ioannis Ntzoufras (2007) - Analysis of Sports Data Using Skellam Distribution
Abstract: Modelling football match outcomes is becoming increasingly popular nowadays for both team managers and betting fans. Most of the existing literature deals with modelling the number of goals scored by each team. In the present paper we work in a different direction. Instead of modelling the number of goals directly, we focus on the difference of the number of goals, i.e. the margin of victory. We recast interest in the so-called Skellam distribution. Modelling the differences instead of the scores themselves has some major advantages. Firstly, we eliminate correlation imposed by the fact that the two opponent teams compete against each other, and secondly we do not assume that the scored goals by each team are marginally Poisson distributed. Application of the Bayesian methodology for the Skellam's distribution using covariates is discussed. Illustrations using real data from the English Premiership for the season 2006-2007 are provided. The advantages of the proposed approach are also discussed.
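To make the idea of modelling the goal difference concrete, here is a minimal sketch of the Skellam probability mass function. It uses the textbook construction of the Skellam distribution as the difference of two independent Poisson counts, which is narrower than the setting described in the abstract (the paper does not require Poisson marginals). The function names, the rates, and the `max_terms` truncation are assumptions made for illustration only.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def skellam_pmf(z, lam1, lam2, max_terms=60):
    """P(X - Y = z) for independent X ~ Poisson(lam1), Y ~ Poisson(lam2).

    Computed by summing P(X = k + z) * P(Y = k) over k, truncated after
    max_terms terms (an assumption of this sketch, adequate for small rates).
    """
    start = max(0, -z)  # k + z must be non-negative
    return sum(
        poisson_pmf(k + z, lam1) * poisson_pmf(k, lam2)
        for k in range(start, start + max_terms)
    )


if __name__ == "__main__":
    # Hypothetical rates: probability of a draw (goal difference of zero).
    print(skellam_pmf(0, 1.5, 1.1))
```

Summing `skellam_pmf(z, ...)` over positive `z` gives the probability of a win for the first team, and over negative `z` the probability of a win for the second.
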
| Recursive Bayesian Estimation |
|---|
1. Leonhard Knorr-Held (1999) - Dynamic Rating of Sports Teams
Abstract: We consider the problem of dynamically rating sports teams on the basis of categorical outcomes of paired comparisons such as win, draw and loss in football. Our modelling framework is the cumulative link model for ordered responses, where latent parameters represent the strength of each team. A dynamic extension of this model is proposed with close connections to nonparametric smoothing methods. As a consequence, recent results have more influence in estimating current abilities than results in the past. We highlight the importance of using a specific constrained random walk prior for time-changing abilities which guarantees an equal treatment of all teams. Estimation is done with an extended Kalman filter and smoother algorithm. An additional hyperparameter which determines the temporal dynamic of the latent team abilities is chosen on the basis of the optimal one-step-ahead predictive power. Alternative estimation methods are also considered. We apply our method to the results from the German football league Bundesliga 1996-1997 and to the results from the American National Basketball Association 1996-1997.
2. James R. Ashburn, Paul M. Colvert (2006) - A Bayesian Mean-Value Approach for the Ranking of Football Teams
Abstract: We introduce a Bayesian mean-value approach for ranking all college football teams using only win-loss data. This approach is unique in that the prior distribution necessary to handle undefeated and winless teams is calculated self-consistently. Furthermore, we will show statistics supporting the validity of the prior distribution. Finally, a brief comparison with other football rankings will be presented.
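The recursive flavour of the methods in this category can be sketched with a scalar Kalman filter that tracks a single latent team strength following a random walk. This is a deliberately simplified linear Gaussian illustration, not the extended Kalman filter for ordered win, draw, and loss outcomes described in the Knorr-Held abstract; the function name, the parameters, and the idea of feeding in noisy numeric performance scores are all assumptions made for this example.

```python
def kalman_random_walk(observations, process_var, obs_var,
                       init_mean=0.0, init_var=1.0):
    """Filtered means of a latent strength that follows a random walk.

    process_var controls how quickly the strength may drift between matches;
    obs_var is the noise variance of each observed performance score.
    """
    mean, var = init_mean, init_var
    filtered_means = []
    for y in observations:
        # Predict: a random walk keeps the mean and inflates the variance.
        var = var + process_var
        # Update: blend the prediction with the new observation.
        gain = var / (var + obs_var)
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * var
        filtered_means.append(mean)
    return filtered_means


if __name__ == "__main__":
    # Hypothetical performance scores for one team over five matches.
    print(kalman_random_walk([0.8, 1.1, 0.9, 1.4, 1.2],
                             process_var=0.1, obs_var=0.5))
```

A larger `process_var` makes recent results count more heavily, which mirrors the abstract's remark that recent results have more influence on current ability estimates than older ones.
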
| Kernel Regression |
|---|
1. Ranjeeth Kumar, C. V. Jawahar (2005) - Kernel Approach to Autoregressive Modeling
Abstract: A kernel-based approach for nonlinear modeling of time series data is proposed in this paper. Autoregressive modeling is achieved in a feature space defined by a kernel function using a linear algorithm. The method extends the advantages of the conventional autoregressive models to characterization of nonlinear signals through the intelligent use of kernel functions. Experiments with synthetic signals demonstrate that this method seems to be a promising alternative to nonlinear modeling schemes.
2. Liva Ralaivola, Florence d'Alché-Buc (2005) - Nonlinear Time Series Filtering
Abstract: In this paper, we propose a new model, the Kernel Kalman Filter, to perform various nonlinear time series processing. This model is based on the use of Mercer kernel functions in the framework of the Kalman Filter or Linear Dynamical Systems. Thanks to the kernel trick, all the equations involved in our model to perform filtering, smoothing and learning tasks, only require matrix algebra calculus whilst providing the ability to model complex time series. In particular, it is possible to learn dynamics from some nonlinear noisy time series implementing an exact EM procedure. When predictions in the original input space are needed, an efficient and original preimage learning strategy is proposed.