Associação Brasileira de Estatística
XIV EBEB - Brazilian Meeting on Bayesian Statistics - Rio de Janeiro

Oral Communications


Oral 1

 

Title: Identifying down and up-regulated chromosome regions using RNA-Seq data
Authors: Vinícius Diniz Mayrink; Flávio Bambirra Gonçalves
Abstract: The number of studies dealing with RNA-Seq data analysis has grown quickly in recent years, making this type of gene expression data a strong competitor to DNA microarrays. This paper proposes a Bayesian model to detect down- and up-regulated chromosome regions using RNA-Seq data. The methodology builds on recent work designed to detect up-regulated regions in the context of microarray data. A hidden Markov model is developed by considering a mixture of Gaussian distributions with ordered means, so that the first and last mixture components accommodate the under- and overexpressed genes, respectively. The model is flexible enough to deal efficiently with the highly irregular spacing of the data by assuming a hierarchical Markov dependence structure. The analysis of four cancer data sets (breast, lung, ovarian and uterus) is presented. Results indicate that the proposed model is selective in determining the regulation status, is robust with respect to prior specifications and provides tools for a global or local search of under- and overexpressed chromosome regions.
Keywords: Hidden Markov model; Mixture model; Gibbs Sampling; Gene expression; Cancer

 

Title: Logit-linear modelling of Poisson Point Process: a Bayesian option for presence-only type data
Authors: Guido Alberti Moreira; Dani Gamerman
Abstract: In ecology, Species Distribution Models (SDMs) are widely used to learn the occurrence pattern of species in a region. However, presence-only data carry very strong sampling bias, and SDMs need to account for it. Some recent works have begun to model this bias through a probability of observing an occurrence that is logit-linear in the covariates. We adopt the same device, and we also model the species occurrences as a Poisson point process, which is a natural choice for this type of data. While other works find the maximum likelihood estimate of an intensity function that is log-linear in the covariates, we propose a fully Bayesian analysis together with a logit link for the intensity, which is divided by a positive value to ensure a value between 0 and 1. Results based on simulated data are presented and discussed, with some emphasis on model misspecification. Finally, identifiability issues raised in the literature are discussed and overcome with the proposed model.
Keywords: Poisson Point Processes; Presence-only; Logit-linear intensity; Bayesian; Ecology
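
Illustrative sketch (not the authors' code): the data-generating mechanism described in the abstract above can be simulated by thinning a homogeneous Poisson process twice, once with a logit-linear occurrence probability (the intensity divided by a positive bound) and once with a logit-linear observation probability. The unit-square region, the use of the coordinates as covariates, and the names `lam_star`, `beta` and `alpha` are assumptions made only for this example.

```python
import math
import random


def logistic(z):
    """Inverse logit, mapping a linear predictor to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def poisson_count(rng, mean):
    """Draw a Poisson(mean) count by counting unit-rate exponential arrivals."""
    n, t = 0, rng.expovariate(1.0)
    while t < mean:
        n += 1
        t += rng.expovariate(1.0)
    return n


def simulate_presence_only(rng, lam_star, beta, alpha):
    """Simulate occurrences and the observed (presence-only) subset.

    Assumed intensity on the unit square: lam_star * logistic(beta0 + beta1 * x).
    Assumed observation probability: logistic(alpha0 + alpha1 * y).
    """
    occurrences, observed = [], []
    for _ in range(poisson_count(rng, lam_star)):
        x, y = rng.random(), rng.random()
        # First thinning: keep the point as a true occurrence.
        if rng.random() < logistic(beta[0] + beta[1] * x):
            occurrences.append((x, y))
            # Second thinning: the occurrence is actually recorded.
            if rng.random() < logistic(alpha[0] + alpha[1] * y):
                observed.append((x, y))
    return occurrences, observed


rng = random.Random(1)
occurrences, observed = simulate_presence_only(
    rng, lam_star=500.0, beta=(-1.0, 2.0), alpha=(0.0, -1.5)
)
```

Only `observed` would be available in a presence-only study; `occurrences` is kept here so the effect of the sampling bias can be inspected.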

 

Title: Interacting cluster point process model for epidermal nerve fibers
Authors: Nancy L. Garcia; Peter Guttorp
Abstract: The central idea of this paper is to propose a repulsive cluster point process that models the pattern of epidermal nerve fibers (ENFs). The proposed approach is to consider invariant measures of birth and death cluster processes.
Keywords: repulsive cluster processes; Matérn III process; Birth and death processes

 


Oral 2

 

Title: Big Data and BART: a Consensus Monte Carlo Solution
Authors: Lucas Tavares Short Cabral; Hedibert Freitas Lopes
Abstract: This work studies, in a Big Data context, Bayesian Additive Regression Trees (BART) when sampled via Consensus Monte Carlo (CMC). Here, Big Data means a volume of data so large that it must be distributed across several computers or cannot be stored in the memory of a single machine. BART is a nonparametric regression model that uses a sum of many binary trees, with depth regularized through the prior, to build predictive functions. CMC, in turn, is a way of combining Monte Carlo samples from different computers (or from partitions of the data) to produce an approximation to the full-data posterior. The goals of the work are to study how the priors behave in this context with respect to the model's predictive ability and to examine how BART variable selection behaves under CMC. To this end, we propose a new way of aggregating the results of the different BARTs in CMC, using the Pearson correlation combined with the inverse of the sample covariance matrix as weights. In the authors' assessment, this simple change yields better predictions than those known in the literature, without a substantial increase in computational time.
Keywords: BART; Consensus Monte Carlo; Big Data
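
Illustrative sketch (not the authors' code): the CMC combination step mentioned in the abstract above can be pictured, for a single scalar quantity, as a weighted average of the g-th draw from every shard, with weights equal to the inverse sample variance of each shard's draws. This shows only the standard inverse-variance weighting; the Pearson-correlation-based weights proposed in the abstract are not specified there and are not implemented here. The function name `consensus_draws` and the simulated shard draws are assumptions made for the example.

```python
import random
import statistics


def consensus_draws(shard_draws):
    """Combine per-shard draws of a scalar quantity into consensus draws.

    shard_draws: list of equal-length lists, one list of draws per shard.
    Each shard is weighted by the inverse of its sample variance.
    """
    weights = [1.0 / statistics.variance(draws) for draws in shard_draws]
    total = sum(weights)
    n_draws = len(shard_draws[0])
    return [
        sum(w * draws[g] for w, draws in zip(weights, shard_draws)) / total
        for g in range(n_draws)
    ]


# Stand-in for posterior draws produced independently on three shards.
rng = random.Random(0)
shards = [
    [rng.gauss(1.0, 0.5) for _ in range(2000)],
    [rng.gauss(1.2, 1.0) for _ in range(2000)],
    [rng.gauss(0.9, 0.7) for _ in range(2000)],
]
combined = consensus_draws(shards)
```

For vector-valued quantities the scalar weights would be replaced by inverse sample covariance matrices, which is the matrix form the abstract refers to.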

 

Title: Hierarchical stochastic block model for community detection in multiplex networks
Authors: Marina Silva Paez; Lizhen Lin
Abstract: Multiplex networks have become increasingly prevalent in many fields, and have emerged as a very powerful tool for modeling the complexity of real networks. There is a critical need for inference models for multiplex networks that can take into account potential dependency across different layers, particularly when the aim is community detection. We fill this gap by proposing a novel and efficient Bayesian model for community detection in multiplex networks. One of the key features of our model is that it allows the communities at different layers of the network to vary, unlike many existing methods for modeling multiplex networks, which require the communities to be the same or fixed across all layers. A random partition prior is imposed on the partitions across different layers, allowing dependency in their structure. Marginally, a stochastic block model (SBM) is assumed for each layer. Efficient MCMC algorithms are developed for sampling from the posterior of the communities, or the partition structure, as well as the link probabilities between nodes or communities. The algorithms are applied in extensive simulation studies and data examples, which demonstrate the good performance of the models and algorithms.
Keywords: Multiplex networks; Stochastic block models; Hierarchical Dirichlet process

 

Title: Multidimensional IRT Models for Hierarchical Latent Structures
Authors: Juliane Venturelli; Flávio Bambirra Gonçalves
Abstract: This work proposes a class of multidimensional IRT models for complex hierarchical latent trait structures that accommodate dichotomous items. The existing higher order IRT models consider only multiunidimensional items. The present work therefore extends multidimensional models by allowing a flexible hierarchical structure for the latent variables. This approach allows one to model more complex situations, as are commonly expected for cognitive latent variables. We devise a novel and efficient MCMC algorithm to perform Bayesian inference.
Keywords: Higher Order Models; IRT; Multidimensional

 


Oral 3

 

Title: Bayesian model averaging over tree-based dependence structures for multivariate extremes
Authors: Sabrina Vettori; Raphael Huser; Johan Segers; Marc Genton
Abstract: Describing the complex dependence structure of extreme phenomena is particularly challenging. To tackle this issue, we develop a Bayesian model that describes extremal dependence by taking advantage of the inherent hierarchical dependence structure of the max-stable nested logistic distribution. Our proposed algorithm can identify possible clusters of extreme variables using reversible jump Markov chain Monte Carlo techniques. Parsimonious representations are achieved when clusters of extreme variables are found to be completely independent. Moreover, we significantly decrease the computational complexity of full likelihood inference by deriving a recursive formula for the nested logistic model likelihood. The algorithm performance is verified through extensive simulation experiments, which also compare different likelihood procedures. The new methodology is used to investigate the dependence relationships between extreme concentrations of multiple pollutants in California and how these pollutants are related to extreme weather conditions. Overall, we show that our approach allows for the representation of complex extremal dependence structures and has valid applications in multivariate data analysis, such as air pollution monitoring, where it can guide policymaking. If time allows, Bayesian modeling of multivariate spatial extremes will also be discussed.
Keywords: Extreme event; Max-stable distribution; Reversible jump MCMC

 

Title: Dynamic Sparsity in Dynamic Regression Models
Authors: Paloma Vaissman Uribe; Hedibert Freitas Lopes
Abstract: In the present work, we consider variable selection and shrinkage for the Gaussian dynamic linear regression within a Bayesian framework. In particular, we propose a novel method that allows for time-varying sparsity, based on an extension of spike-and-slab priors for dynamic models. This is done by assigning appropriate Markov switching priors to the time-varying coefficients' variances, extending the previous work of Ishwaran and Rao (2005). Furthermore, we investigate different priors, including the common inverse gamma prior for the process variances, and other mixture prior distributions such as Gamma priors for both the spike and the slab, which leads to a mixture of Normal-Gamma priors (Griffin et al., 2010) for the coefficients. In this sense, our prior can be viewed as a dynamic variable selection prior which induces either smoothness (through the slab) or shrinkage towards zero (through the spike) at each time point. The MCMC method used for posterior computation uses Markov latent variables that can assume binary regimes at each time point to generate the coefficients' variances. Our model is therefore a dynamic mixture model, so we can use the algorithm of Gerlach et al. (2000) to generate the latent processes without conditioning on the states. Finally, our approach is illustrated through simulated examples and a real data application.
Keywords: spike-and-slab prior; Normal-Gamma prior; shrinkage; dynamic models

 

Title: Bayesian Inference in a copula-based framework for stochastic volatility models with leverage effect and heavy-tail using MCMC methods
Authors: Lina L. Hernandez-Velasco; Carlos A. Abanto-Valle
Abstract: In this article, we develop a copula-based framework for constructing and fitting flexible stochastic volatility (SV) models that capture some of the most important features of stock index returns: negative correlation between returns and future volatility (known as the leverage effect), as well as heavy tails of the conditional return distribution. The main idea is to construct the joint conditional density using a Gaussian copula, which allows choosing the marginal error distributions of returns and volatility independently and then stitching them together through the copula. Model parameters are estimated by Bayesian inference using Markov chain Monte Carlo (MCMC) methods. A real data set based on the S&P 500 index is analyzed.
Keywords: Stochastic volatility; Copulas; Scale mixture of normal distributions
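
Illustrative sketch (not the authors' code): the "stitching" idea in the abstract above can be shown by drawing a correlated pair of standard normals, mapping each to a uniform through the normal CDF, and then pushing each uniform through the quantile function of a freely chosen marginal. The Laplace marginal for the return error (one heavy-tailed scale mixture of normals), the standard normal marginal for the volatility error, and the value `rho = -0.6` are assumptions picked only for this example.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def laplace_quantile(u, scale=1.0):
    """Quantile function of a zero-centered Laplace distribution."""
    return -scale * math.copysign(1.0, u - 0.5) * math.log(1.0 - 2.0 * abs(u - 0.5))


def sample_gaussian_copula(rng, rho, n):
    """Draw n (return error, volatility error) pairs linked by a Gaussian copula."""
    pairs = []
    for _ in range(n):
        # Correlated standard normals define the dependence structure.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        # Map to uniforms, then to the chosen marginals.
        u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
        eps = laplace_quantile(u1)       # heavy-tailed return error
        eta = STD_NORMAL.inv_cdf(u2)     # Gaussian volatility error
        pairs.append((eps, eta))
    return pairs


rng = random.Random(42)
pairs = sample_gaussian_copula(rng, rho=-0.6, n=5000)
```

A negative `rho` makes large negative return errors tend to coincide with positive volatility errors, which is the direction of dependence the leverage effect describes.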

 


Oral 4

 

Title: Bayesian approach for distribution of r-largest order statistics (GEVr) with dynamic seasonal model structure
Authors: Renato Santos da Silva; Fernando Ferraz do Nascimento
Abstract: Time series analysis studies how the behavior of data can change over time. This type of change is common in data to which extreme value theory (EVT) is applied. In environmental data such as rainfall, wind and temperature, levels may be correlated with seasonality, in addition to showing a tendency to increase over time due to climate change. Environmental data are in most cases heavy-tailed. In some EVT settings, analyzing only the maximum (GEV) of a data set provides few observations; in these cases it is more useful to work with the r-largest order statistics (GEVr). This work develops a new extreme value model under the Bayesian approach, introducing a Linear Dynamic Seasonal Model (DLMS), a subclass of time series models, to model the GEVr parameters over time. The proposed model was applied to the temperature time series (in °C) of Teresina-PI, in order to track the seasonality of temperature in the capital of Piauí.
Keywords: Bayesian inference; Extreme Value Theory; MCMC methods; Dynamic Models

 

Title: Short-Term Extreme and Non-Extreme Wind Speed Forecasting
Authors: Daniela Castro-Camilo; Raphaël Huser
Abstract: Renewable sources of energy such as wind power have become a sustainable alternative to fossil fuel-based energy. However, the uncertainty and fluctuation of wind speed, which stem from its intermittent nature, pose a serious threat to the stability of wind power production and to the wind turbines themselves. A turbine cut-off point denotes how fast the turbine can go before the turbine blades are brought to rest to prevent damage from extreme wind speeds. In this work, we develop a flexible temporal model that combines in-site and off-site information to help us 1) accurately forecast wind speeds, 2) estimate the probability of exceeding the cut-off point, and 3) obtain short-term wind power predictions. Our model can handle non-extreme and extreme observations at the same time, accurately describing both the bulk and the tail of the wind speed distribution. To improve wind speed forecasting, wind direction is automatically incorporated into the model, which means that no previous knowledge of wind patterns is required. Our model belongs to the wide class of latent Gaussian models; we therefore estimate model parameters and predictive distributions by taking advantage of the powerful and efficient Integrated Nested Laplace Approximation (Rue, H., Martino, S. and Chopin, N. (2009). Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(2), 319-392).
Keywords: Extreme value theory; Threshold-based inference; Latent Gaussian models; INLA; Wind speed forecasting

 

Title: Efficient Closed-Form Maximum a Posteriori Estimators for the Gamma and Nakagami-m distributions
Authors: Francisco Louzada; Pedro Luiz Ramos; Eduardo Ramos
Abstract: The Nakagami distribution plays an important role in communication engineering problems, particularly in modeling the fading of radio signals. A maximum a posteriori (MAP) estimator for the Nakagami-m fading parameter is proposed. The MAP estimator has a simple closed-form expression and can be rewritten as a bias-corrected generalized moment estimator. Numerical results demonstrate that the MAP estimation scheme outperforms the existing estimation procedures and produces almost unbiased estimates of the fading parameter even for small sample sizes. The potential of the proposed methodology is illustrated on a real reliability data set.
Keywords: Gamma Distribution; Nakagami Distribution; Bayesian Analysis; Maximum likelihood estimators; Closed-Form estimator