Research

About

Welcome to my research web page. My general research interest is data-driven decision-making under uncertainty, with a wide variety of applications including manufacturing, healthcare, infrastructure, and energy systems. My work characterizes the structural properties of optimal policies for the problems of interest and addresses the data modeling and computational challenges present in large-scale problems. Motivated by the need to analyze sequential and longitudinal data in manufacturing and healthcare, I am also interested in statistical machine learning methods. My current work in this direction focuses on the development and use of Bayesian learning methods to improve diagnostic and prognostic capabilities in manufacturing and healthcare. My research is largely application-driven; close collaborations with scientists and engineers make up a significant component of my work. My overarching goal is to develop novel and efficient optimization and statistical methods that improve results in the motivating application areas while having much broader general utility.

My work has been published in refereed journals such as INFORMS Journal on Computing, European Journal of Operational Research, and IISE Transactions. My research has been funded by NSF (including a CAREER grant), DOE, and industry partners. I direct a number of graduate and undergraduate students in research, which has led to joint publications. Some of my students aspire to academic positions, while others are attracted to research careers in industry. Some are passionate about theoretical and methodological research, and some are drawn to real-world applications. I seek to channel their strengths and interests and help them grow into leaders in their respective fields.

I love to teach, encourage, and help students achieve and succeed. I value my time in the classroom. Courses I have taught in the past few years include: Stochastic Processes, Decision Making under Uncertainty, Maintenance Modeling and Optimization, Production and Inventory Control, and Operations Research.

I’m always looking for self-motivated students with interests in decision-making under uncertainty, reinforcement learning, and statistical machine learning. Interested candidates with a strong background in operations research, mathematics, statistics, computer science, or a related field should send a CV, transcripts, and TOEFL score to yxiang4 at uh.edu.

 

Decision-Making under Uncertainty

Data-driven decision making

Markov decision processes (MDPs) are often used to model sequential decision-making problems in uncertain dynamic environments. For example, a wide range of equipment maintenance and replacement problems can be formulated as an MDP in which a decision maker periodically inspects the condition of the equipment and decides which maintenance actions, if any, to carry out. Inventory management is another important application of sequential decision-making. One of its core problems is to develop optimal ordering policies (e.g., reorder point, order quantity) in a supply chain network to deal with demand uncertainty and potential supply disruptions. Many medical decision problems that involve monitoring patients’ conditions and selecting treatments can also be formulated as sequential decision processes. In these applications, the underlying distribution of the random quantities is often fundamentally unknown and must be estimated from historical data. The estimates are typically subject to large statistical estimation errors due to factors such as limited data, data errors, and noise. My research group currently focuses on (1) developing new data-driven methods to model parameter and distribution uncertainty and (2) extending and enhancing existing methods to address specific applications.
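As a minimal sketch of the maintenance formulation described above, the following value iteration solves a tiny equipment MDP. The three condition states, the two actions, and every probability and cost are hypothetical numbers chosen for illustration; they do not come from any of the projects on this page.

```python
# Hypothetical 3-state equipment MDP: 0 = good, 1 = worn, 2 = failed.
# Actions: 0 = do nothing, 1 = repair. All numbers are illustrative assumptions.
STATES = [0, 1, 2]
ACTIONS = [0, 1]

# P[a][s][s2] = probability of moving from state s to s2 under action a.
P = {
    0: [[0.8, 0.15, 0.05],   # equipment degrades if left alone
        [0.0, 0.7, 0.3],
        [0.0, 0.0, 1.0]],
    1: [[1.0, 0.0, 0.0],     # repair returns the equipment to "good"
        [1.0, 0.0, 0.0],
        [1.0, 0.0, 0.0]],
}
# C[a][s] = one-period cost of taking action a in state s.
C = {
    0: [0.0, 2.0, 20.0],
    1: [5.0, 5.0, 12.0],
}
GAMMA = 0.95  # discount factor


def q_value(V, s, a, gamma=GAMMA):
    """Expected discounted cost of taking action a in state s, then following V."""
    return C[a][s] + gamma * sum(P[a][s][s2] * V[s2] for s2 in STATES)


def value_iteration(gamma=GAMMA, tol=1e-8, max_iter=10_000):
    """Return (value function, greedy cost-minimizing policy)."""
    V = [0.0 for _ in STATES]
    for _ in range(max_iter):
        V_new = [min(q_value(V, s, a, gamma) for a in ACTIONS) for s in STATES]
        converged = max(abs(x - y) for x, y in zip(V_new, V)) < tol
        V = V_new
        if converged:
            break
    policy = [min(ACTIONS, key=lambda a: q_value(V, s, a, gamma)) for s in STATES]
    return V, policy


if __name__ == "__main__":
    V, policy = value_iteration()
    print("values:", [round(v, 2) for v in V])
    print("policy (0 = do nothing, 1 = repair):", policy)
```

In this toy instance the transition probabilities are treated as known. The data-driven setting discussed above is harder precisely because they have to be estimated from limited and noisy data.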

Selected Research Projects

  • DATA-DRIVEN REMANUFACTURING PLANNING

Remanufacturing has emerged in the past decade as a critical element of a sustainable manufacturing industry. However, contrary to the conventional wisdom that it reduces environmental impacts, some studies have shown that in many cases remanufacturing actually leads to negative outcomes. The goal of my research in this area is to prescribe optimal planning policies that enhance the environmental and economic sustainability of remanufacturing. Determining when remanufacturing is worthwhile first requires modeling the state transitions of a remanufacturing system. A challenge in estimating these transitions is that field data are often limited and typically contain a good deal of incorrect information. To mitigate the effects of parameter estimation errors, we create a novel data-driven modeling framework for remanufacturing planning in which decision makers remain robust to statistical estimation errors in the transition dynamics. We model the remanufacturing planning problem as a robust Markov decision process and construct uncertainty sets from historical transition data using the notion of “distance” from a reference distribution. We further establish structural properties of optimal robust policies and derive insights for remanufacturing planning. A case study on the NASA turbofan engine shows that our data-driven optimization framework consistently yields better worst-case performance and higher reliability.
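To illustrate the idea of an uncertainty set defined by distance from a reference distribution, here is a minimal sketch of one robust Bellman backup. It assumes an L1-ball uncertainty set around an empirical transition distribution; the description above does not say which distance the framework uses, so the L1 choice, the function names, and the numbers are all assumptions made for illustration.

```python
def worst_case_expectation(p_hat, values, radius):
    """Maximize sum(p[i] * values[i]) over distributions p with
    ||p - p_hat||_1 <= radius (assumed L1-ball uncertainty set).

    Greedy solution: shift up to radius / 2 of probability mass onto the
    highest-cost successor, taking it from the lowest-cost successors first.
    Returns (worst-case expected value, worst-case distribution).
    """
    n = len(p_hat)
    p = list(p_hat)
    worst = max(range(n), key=lambda i: values[i])
    shift = min(radius / 2.0, 1.0 - p[worst])
    p[worst] += shift
    remaining = shift
    for i in sorted(range(n), key=lambda i: values[i]):
        if remaining <= 0.0:
            break
        if i == worst:
            continue
        take = min(p[i], remaining)
        p[i] -= take
        remaining -= take
    return sum(pi * vi for pi, vi in zip(p, values)), p


def robust_backup(cost, gamma, p_hat, next_values, radius):
    """One robust Bellman backup for a fixed (state, action) pair in a
    cost-minimizing MDP: immediate cost plus discounted worst-case cost-to-go."""
    worst_value, _ = worst_case_expectation(p_hat, next_values, radius)
    return cost + gamma * worst_value


if __name__ == "__main__":
    p_hat = [0.5, 0.3, 0.2]           # hypothetical empirical transition estimate
    next_values = [0.0, 1.0, 10.0]    # hypothetical cost-to-go of successor states
    print(robust_backup(1.0, 0.9, p_hat, next_values, radius=0.0))  # nominal backup
    print(robust_backup(1.0, 0.9, p_hat, next_values, radius=0.2))  # robust backup
```

With `radius=0` the backup reduces to the nominal one. Larger radii, which would typically correspond to fewer observed transitions, give more conservative values.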

Illustrations of some raw sensor data simulated by NASA's C-MAPSS software.

Multi-stage scenario tree. The two main challenges are the explosion of the problem size and endogenous uncertainty (decision-dependent transition probabilities).

  • STOCHASTIC MAINTENANCE OPTIMIZATION

I have extensive research experience in decision-making under uncertainty with applications in maintenance optimization. I have developed optimal maintenance policies to improve operational efficiency for various engineering systems, and I have worked with graduate students to optimize preventive maintenance for complex multi-component systems such as piping and wind turbines. Multi-component maintenance optimization couples the stochastic processes governing component failures with the combinatorial problem of grouping maintenance activities. It is challenging in both modeling and solution techniques, and it has remained an open issue in the literature. We formulated the problem as a multi-stage stochastic integer program with decision-dependent uncertainty, investigated the structural properties of the two-stage problem, and developed efficient algorithms to handle large-scale multi-stage problems. Numerical experiments show that our models and algorithms can outperform existing approaches.

Statistical Machine Learning

I am broadly interested in exploring statistical machine learning methods that can aid the study of complex real-world data. My current work in this area has focused on the development and use of machine learning methods for analyzing sequential and longitudinal data in manufacturing and healthcare.

Selected Research Projects

  • BAYESIAN HIDDEN MARKOV MODELS FOR PREDICTION OF CLINICAL EVENTS

Hidden Markov models have proven to be excellent general models for learning problems in sequential data, but they have two inherent disadvantages: (1) geometric state durations and (2) the restrictive assumption of first-order Markovian dynamics. Recent work in Bayesian nonparametrics (e.g., Bayesian higher order hidden Markov models) has addressed the latter issue. We first combined semi-Markovian ideas with higher order hidden Markov models to construct a general class of higher order hidden semi-Markov models (HOHSMMs). We further developed mixed-effect higher order hidden Markov models (MHOHMMs), in which both covariates and random effects are incorporated in the hidden and conditional parts. We explored the use of the MHOHMM-based classification framework for predicting clinical events from electronic health records (EHRs). The heterogeneity caused by group-level differences (e.g., age, gender) and patient-level differences is modeled by covariates and random effects, respectively. Compared with the purely “black-box” machine learning methods that have been devised for clinical event prediction using EHRs, a crucial advantage of the proposed approach is its ability to provide insights into key scientific queries related to the global and local influences of the exogenous predictors.
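For readers unfamiliar with the first limitation, it can be stated in one line. In a standard first-order HMM, the number of consecutive steps D_i spent in a hidden state i with self-transition probability a_ii (notation introduced here for illustration) follows a geometric distribution:

```latex
P(D_i = d) = a_{ii}^{\,d-1}\,(1 - a_{ii}), \quad d = 1, 2, \dots,
\qquad
\mathbb{E}[D_i] = \frac{1}{1 - a_{ii}}.
```

The most likely duration is therefore always a single step, whatever the value of a_ii. Semi-Markov variants replace this implied geometric law with an explicit duration distribution for each state.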

We developed a mixed-effects higher order HMM-based framework for acute hypotensive episode (AHE) prediction in ICUs.

  • DEEPMICETL: A DEEP TRANSFER LEARNING-BASED PREDICTION OF MICE CARDIAC ARRHYTHMIAS USING EARLY ELECTROCARDIOGRAMS

Genetic defects are among the common causes and risk factors of cardiac arrhythmias. However, developing screening tools for early identification of patients with these defects is challenging because it is difficult to test any such tools on humans. Given the high genetic homology between mice and humans, we use mouse models to develop a method for predicting arrhythmias associated with genetic abnormalities. We hypothesize that mice with genetic defects present subtle abnormalities in their early ECGs before severe arrhythmias occur and that these subtle patterns can be detected by deep learning methods. To address the scarcity of mouse ECG data, we propose a deep transfer learning model, DeepMiceTL, which leverages knowledge from human ECGs to learn mouse ECG patterns. We further apply Bayesian optimization and k-fold cross-validation to tune the hyperparameters of DeepMiceTL. Our results show that DeepMiceTL achieves promising performance (F1-score: 88.7%, accuracy: 87.26%) in predicting the occurrence of gene-associated arrhythmias from early mouse ECGs. In addition, to reduce the need for continuous mouse ECG monitoring and to facilitate effective experiment management, we use DeepMiceTL to predict the occurrence time of severe arrhythmias and obtain satisfactory performance.
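For reference, the two evaluation ingredients mentioned above, k-fold splitting and the F1/accuracy metrics, can be sketched in a few lines of plain Python. This is a generic illustration, not the DeepMiceTL code: the function names and the toy labels are hypothetical, and no model or hyperparameter search is included.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds.

    Fold sizes differ by at most one. Real pipelines usually shuffle
    (and often stratify) before splitting; that step is omitted here.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, validation
        start = stop


def f1_and_accuracy(y_true, y_pred):
    """Binary F1-score and accuracy, with label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return f1, correct / len(y_true)


if __name__ == "__main__":
    for train, validation in k_fold_indices(n_samples=5, k=2):
        print("train:", train, "validation:", validation)
    # Toy labels only; these are not results from the study.
    print(f1_and_accuracy([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

In hyperparameter tuning, each candidate configuration would be scored by averaging such metrics over the k validation folds.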

The workflow of the proposed method.