Researchers have developed an artificial intelligence (AI) tool that uses sequences of life events — such as health history, education, job and income — to predict everything from an individual’s personality to their lifespan.
Built using transformer models, which power large language models (LLMs) like ChatGPT, the tool, called life2vec, is trained on a data set drawn from the entire population of Denmark.
Life2vec can predict future life outcomes, including how long individuals will live, with an accuracy that exceeds that of state-of-the-art models, the researchers said.
However, despite its predictive power, the research team said it is best used as the foundation for future work, not an end in itself.
“Even though we’re using prediction to evaluate how good these models are, the tool shouldn’t be used for prediction on real people,” said Tina Eliassi-Rad, a professor at Northeastern University in the US.
“It is a prediction model based on a specific data set of a specific population,” Eliassi-Rad said.
By involving social scientists in building the tool, the team hopes to bring a human-centred approach to AI development, one that doesn’t lose sight of the people behind the massive data set the tool was trained on.
“This model offers a much more comprehensive reflection of the world as it is lived by human beings than many other models,” said Sune Lehmann, an author of the study published in the journal Nature Computational Science.
At the heart of life2vec is the massive data set the researchers used to train their model.
The researchers used that data to create long patterns of recurring life events to feed into their model, taking the transformer model approach used to train LLMs on language and adapting it for a human life represented as a sequence of events.
“The whole story of a human life, in a way, can also be thought of as a giant long sentence of the many things that can happen to a person,” said Lehmann, a professor at the Technical University of Denmark.
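As a rough illustration of the "life as a sentence" idea, the sketch below turns one person's chronological life events into a sequence of integer tokens, the same kind of input a language model receives for text. The event names and the helper functions build_vocab and encode are invented for this example and are not taken from the study.

```python
# Hypothetical life events for one person, in chronological order.
# These event names are invented for illustration only.
life_events = [
    "EDU_high_school_completed",
    "JOB_retail_assistant",
    "INCOME_bracket_2",
    "HEALTH_gp_visit",
    "JOB_office_clerk",
    "INCOME_bracket_3",
    "HEALTH_gp_visit",
]


def build_vocab(sequences):
    """Assign each distinct event an integer id, in first-seen order."""
    vocab = {}
    for sequence in sequences:
        for event in sequence:
            if event not in vocab:
                vocab[event] = len(vocab)
    return vocab


def encode(sequence, vocab):
    """Replace each event with its integer id, keeping the original order."""
    return [vocab[event] for event in sequence]


vocab = build_vocab([life_events])
token_ids = encode(life_events, vocab)
print(token_ids)  # [0, 1, 2, 3, 4, 5, 3]
```

The repeated doctor's visit maps to the same token both times, just as a repeated word would in a sentence.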
The model uses what it learns from observing millions of life event sequences to build what are called vector representations in embedding spaces, where it starts to categorise and draw connections between life events such as income, education or health factors.
These embedding spaces serve as a foundation for the predictions the model ends up making, the researchers said.
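A toy sketch can show what it means for events to sit close together or far apart in an embedding space. The three-dimensional vectors below are hand-picked for illustration, whereas a trained model would learn them from data. The averaging step in mean_pool is a deliberate simplification and stands in for the transformer that the researchers describe. All names here are hypothetical.

```python
import math

# Toy 3-dimensional embeddings, hand-picked for illustration.
# A trained model would learn these values from the event sequences.
embeddings = {
    "INCOME_bracket_2": [0.9, 0.1, 0.0],
    "INCOME_bracket_3": [0.8, 0.2, 0.1],
    "HEALTH_gp_visit": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; near 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_pool(events):
    """Average the event vectors into one vector for the whole sequence."""
    vectors = [embeddings[event] for event in events]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


income_sim = cosine_similarity(
    embeddings["INCOME_bracket_2"], embeddings["INCOME_bracket_3"]
)
cross_sim = cosine_similarity(
    embeddings["INCOME_bracket_2"], embeddings["HEALTH_gp_visit"]
)
person_vector = mean_pool(["INCOME_bracket_2", "HEALTH_gp_visit"])

# The two income events point in nearly the same direction,
# while the income and health events are almost unrelated.
print(income_sim > cross_sim)  # True
```

In this toy setup, related events end up with a high similarity score, and a single vector summarising a person can be passed to a downstream predictor.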
One of the outcomes the researchers predicted was a person’s probability of dying.
“When we visualise the space that the model uses to make predictions, it looks like a long cylinder that takes you from low probability of death to high probability of death,” Lehmann said.
“Then we can show that in the end where there’s a high probability of death, a lot of those people actually died, and in the end where there’s low probability of dying, the causes of death are something that we couldn’t predict, like car accidents,” the researcher added.