Tools for multivariate data analysis

Ghent University
Monday, June 12, 2017 - 09:00 to Wednesday, June 14, 2017 - 17:00

The course runs over three days. Each day alternates lecture-style presentations with practical hands-on sessions using software packages (in R) for multivariate analysis. Much attention will be given to the interpretation of statistical results in real data examples.

First day:

On the first day, the general features of Factor Analysis and Principal Component Analysis are explained. The goal of both techniques is to summarize a large number of variables by a smaller number that retains the most important information. The differences and similarities between principal component analysis and (exploratory) factor analysis are discussed, and attention is given to important concepts such as communality, eigenvalue, uniqueness, factor loadings, and the Kaiser-Meyer-Olkin criterion. These techniques will be introduced with built-in functions in R (e.g., princomp() and factanal()), and many features will be illustrated with the tools in the R package FactoMineR. The following methods will be covered:

  • Data reduction techniques (e.g., scree plots)
  • Extraction methods: Principal Axis Factoring, Maximum Likelihood
  • Rotation methods: Varimax, Promax, Oblimin (with the R package GPArotation)
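
As a rough illustration of the kind of analysis covered on the first day, the sketch below uses only base R. The dataset (ability.cov, which ships with R) and the choice of two factors are assumptions made for the example and are not taken from the course material.

```r
# Illustrative sketch only: ability.cov is a built-in R dataset
# (covariance matrix of six ability tests), not course data.

# Principal component analysis on the correlation matrix
pca <- princomp(covmat = ability.cov, cor = TRUE)
eigenvalues <- pca$sdev^2           # one eigenvalue per component
print(round(eigenvalues, 3))
screeplot(pca, type = "lines")      # scree plot of the eigenvalues

# Exploratory factor analysis: maximum likelihood extraction,
# two factors (an arbitrary choice here), varimax rotation
efa <- factanal(factors = 2, covmat = ability.cov, rotation = "varimax")
print(efa$loadings, cutoff = 0.3)   # hide small loadings for readability

# Communality is one minus uniqueness for each observed variable
communality <- 1 - efa$uniquenesses
print(round(communality, 3))

# The same model with an oblique (promax) rotation
efa_promax <- update(efa, rotation = "promax")
```

Because the components are extracted from a correlation matrix, the eigenvalues sum to the number of variables, which is what makes the scree plot comparable across datasets.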

The first day will conclude with the first steps in confirmatory factor analysis, which relates observed variables to latent variables (i.e., it tests a measurement model). This serves as an introduction to Structural Equation Modeling on the second day. Confirmatory factor analysis will be demonstrated with the R packages psych and lavaan. In all the exercises throughout the day, much attention will be given to the interpretation of factor loadings and factor scores in real data examples.
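
A minimal sketch of a confirmatory factor analysis in lavaan might look as follows. It assumes that lavaan is installed and uses the HolzingerSwineford1939 example data bundled with that package; the three-factor model is the standard lavaan tutorial example, not necessarily the model used in the course.

```r
library(lavaan)

# Measurement model: each latent variable (left of =~) is
# measured by three observed indicators (right of =~).
cfa_model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

# HolzingerSwineford1939 ships with lavaan (illustration only)
cfa_fit <- cfa(cfa_model, data = HolzingerSwineford1939)

# Selected fit indices and standardized factor loadings
print(fitMeasures(cfa_fit, c("chisq", "df", "cfi", "rmsea", "srmr")))
std <- standardizedSolution(cfa_fit)
print(std[std$op == "=~", c("lhs", "rhs", "est.std")])
```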

Second day:

On the second day, the principles of Structural Equation Modeling are introduced using the R package lavaan. Structural equation modeling (SEM) is a general statistical modeling technique for studying the relationships among observed and unobserved (latent) variables. It spans a wide range of multivariate methods and statistical models, including factor analysis and path analysis. The general framework of SEM therefore builds upon the knowledge obtained on the first day. A brief summary of the second day is as follows:

  • Basics of SEM
  • Introduction to the R package lavaan
  • Model estimation, model evaluation, and model re-specification in practice
  • Reporting analysis
  • A short note on mean structures, multiple groups, and measurement invariance
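
The estimation, evaluation, and re-specification cycle listed above can be sketched in lavaan roughly as follows. The example assumes that lavaan is installed and uses its bundled PoliticalDemocracy data; the model is a simplified version of the well-known lavaan tutorial example and is only an illustration.

```r
library(lavaan)

sem_model <- '
  # measurement model (latent =~ indicators)
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem65 =~ y5 + y6 + y7 + y8

  # structural model (regressions among latent variables)
  dem60 ~ ind60
  dem65 ~ ind60 + dem60
'

# Model estimation (PoliticalDemocracy ships with lavaan)
sem_fit <- sem(sem_model, data = PoliticalDemocracy)

# Model evaluation: parameter estimates plus global fit measures
summary(sem_fit, fit.measures = TRUE, standardized = TRUE)

# Candidates for re-specification: the five largest modification indices
mi <- modindices(sem_fit, sort. = TRUE, maximum.number = 5)
print(mi)
```

Modification indices only point to parameters whose release would improve fit; whether a change is justified remains a substantive decision.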

Third day:

The third day is devoted to one or more special topics in multivariate analysis. As the field is extensive, the organizing university can choose the topic(s) depending on the background and interests of the participants. The topic for this edition is ‘Causal analysis: mediation and instrumental variables’. Structural equation modeling may support causal relationships assumed by the researcher, but it is limited in its ability to infer them. In causal mediation analysis, we consider an intermediate variable, called the mediator, that helps explain how or why an independent variable influences an outcome. Mediation analysis aims to uncover the pathways along which changes are transmitted from causes to effects. Instrumental variables can also be used for causal inference. The following topics will be discussed: introduction to causality and causal diagrams, differences between causal diagrams and SEM/path analysis, introduction to the potential outcome framework, direct and indirect effects in causal mediation analysis using the R package mediation, and instrumental variables in causal inference.
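
To give an impression of a causal mediation analysis, the sketch below uses the R package mediation on simulated data. It assumes that the package is installed; the variable names (treat, med, out), the sample size, and the effect sizes are hypothetical and chosen only for illustration.

```r
library(mediation)

# Simulated data (hypothetical): a randomized binary treatment,
# a continuous mediator, and a continuous outcome.
set.seed(2017)
n     <- 500
treat <- rbinom(n, 1, 0.5)
med   <- 0.5 * treat + rnorm(n)
out   <- 0.3 * treat + 0.4 * med + rnorm(n)
dat   <- data.frame(treat, med, out)

# Mediator model and outcome model
med_fit <- lm(med ~ treat, data = dat)
out_fit <- lm(out ~ treat + med, data = dat)

# Indirect (ACME) and direct (ADE) effects,
# with uncertainty obtained from 200 simulation draws
med_res <- mediate(med_fit, out_fit,
                   treat = "treat", mediator = "med", sims = 200)
summary(med_res)
```

The estimates have a causal interpretation only under the identifying assumptions of the potential outcome framework, such as no unmeasured confounding of the mediator-outcome relationship.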

June 12, 13 and 14, 2017 from 9:00 to 17:00
Henri Dunantlaan 1, Ghent (HILO-GUSB), PC room 1 (first floor)


Fees:

  • PhD students and postdocs at a Flemish university: 0 €
  • Other academics: 180 €
  • Non-profit/social sector: 300 €
  • Private sector: 600 €

For whom: 

This course is aimed at researchers in the social and behavioral sciences (e.g., psychology, education, economics, business, and sociology), the life sciences and medicine (e.g., health sciences), and the natural sciences who want hands-on practice in analyzing multivariate data.


Prerequisites:

Basic understanding of univariate statistics. Basic skills in R are strongly recommended.

Multivariate Statistics
Short Course: 1 < days < 7 / frequently repeated (yearly, quarterly)
Koen Plevoets, Mariska Barendse, Ineke van Gremberghe, Christine Schnor