Academic Talks
2016 Beijing-Tianjin Young Statisticians Academic Salon
Time: October 26, 9:00--12:00
Venue: Room 602, South Building, Academy of Mathematics and Systems Science, No. 55 Zhongguancun East Road, Beijing
Talk 1: Network Imputation for Spatial Autoregression Model with Incomplete Data
Time: 9:00--9:40
Sun Zhimeng (孙志猛), School of Statistics, Central University of Finance and Economics
Abstract: Missing data are commonly encountered in practice, and various imputation methods have been developed and widely used. However, existing imputation methods are mainly developed for independent data, and the assumption of independence ignores the connections among units through various social relationships (e.g., friendship, follower-followee relationship). In fact, the observed responses from connected friends should provide valuable information about missing responses. This motivates us to conduct imputation by borrowing information from connected friends through a network structure. Under the missing at random assumption and using observed information only, we propose a partial likelihood approach and develop the corresponding maximum partial likelihood estimator (MPLE). The estimator's consistency and asymptotic normality are established. Using the MPLE, we then develop a novel regression imputation method. The method utilizes both auxiliary information and connected complete units (i.e., network information), and the imputed data are used to compute the sample mean of the responses. The resulting estimator is shown to be consistent and asymptotically normal. Compared with the imputation method that uses auxiliary information only (i.e., ignores network information), the proposed estimator is statistically more efficient. Extensive simulation studies are conducted to demonstrate its finite sample performance. We then analyze a real example about QQ in mainland China for illustration.
Talk 2: Covariate-adaptive randomization with variable selection in high dimensional data
Time: 9:40--10:20
Yin Jianxin (尹建鑫), School of Statistics, Renmin University of China
Abstract: In clinical trials, balancing treatment allocation for influential covariates is critical. In the big data era, the number of measured covariates is usually much larger than the number of recruited patients and grows very fast as the sample size increases, while only a small fraction of the covariates are relevant to the given response. Recently, Hu and Hu (2012) proposed a new covariate-adaptive randomization procedure which can control three types of imbalance. However, they assume all the relevant covariates are given. How to select the potentially important covariates among a diverging number of candidates and balance treatment allocation on the selected variable set is the main subject of this paper. We study the new situation in adaptive randomization for a diverging and varying-dimensional difference tensor, and redefine the Markov chain for the stratum difference. Treating each covariate's coefficients under the two treatments as a group, we select the important variables under the framework of multi-task learning via the group-LASSO algorithm. From the variable selection and multi-task learning perspective, we relax the condition of equal sample sizes and allow unequal sample sizes for different treatments. Compared to Lounici et al. (2011), we obtain a new result about the probability in the non-asymptotic oracle inequality, which depends on the sample size. These two new results for multi-task learning are of independent interest. Finally, we show that under certain regularity conditions, the regularized adaptive design method can control the asymptotic variance and select the true set of influential covariates simultaneously. In addition to balancing treatment allocation, we also show that the treatment effects can be compared more accurately under this covariate-selected-and-adjusted framework, in the sense of having smaller variance and MSE. Simulation studies support our theoretical findings for the proposed method.
Talk 3: Screening-Assisted Dynamic Multiple Testing with False Discovery Rate Control
Time: 10:30--11:10
Zou Changliang (邹常亮), Institute of Statistics, Nankai University
Abstract: In the era of big data, high-dimensional data often arrive in streams, and thus accurate online decisions are necessary. In many such applications, rapid sequential identification of individuals whose behavior differs from that of the majority or from a target pattern has become increasingly important. Aiming to trigger as many signals of irregular patterns as possible after those individuals' behavior deviates from the regular behavior, we develop a large-scale dynamic testing system in the framework of false discovery rate (FDR) control. By fully exploiting the sequential feature of this problem, we propose a procedure which performs both stream filtering and testing at each time point and then tests only the streams that passed the filter in the previous step. A data-driven optimal screening threshold is derived, which gives the new method the potential to outperform existing methods. Under some mild conditions on the dependence structure, the FDR is shown to be strongly controlled pointwise and the suggested approach for determining screening thresholds is asymptotically optimal.
Talk 4: Regression Analysis of Current Status Data in the Presence of A Cured Subgroup and Dependent Censoring
Time: 11:10--11:50
Hu Tao (胡涛), Capital Normal University
Abstract: This paper discusses regression analysis of current status data, a type of failure time data where each study subject is observed only once, in the presence of dependent censoring. Furthermore, there may exist a cured subgroup, meaning that a proportion of study subjects are not susceptible to the failure event of interest. For this problem, we develop a sieve maximum likelihood estimation approach using latent variables and Bernstein polynomials. To compute the proposed estimators, an EM algorithm is developed, and the asymptotic properties of the estimators are established. Extensive simulation studies are conducted and indicate that the proposed method works well in practical situations. A motivating application from a tumorigenicity experiment is also provided.
Organizers:
Academy of Mathematics and Systems Science, Chinese Academy of Sciences
Capital Normal University