Most organizations have a wellness program. Fewer have one that does anything useful.
The gap between a genuine workplace employee wellness strategy and a collection of health talks and fruit baskets is large, and the research is clear about which side of that gap most programs fall on. A meta-analysis in Health Affairs found that medical costs fall by about $3.27 for every dollar spent on wellness programs, and absenteeism costs drop by about $2.73. Those numbers entered HR folklore. What happened afterward in the experimental literature is the part most organizations don't know about.
The common pattern we encounter is that an organization runs a wellness program for a year or two, can't tell you whether it changed anything, then either doubles the budget or quietly cancels it. Neither response is based on evidence.
This article walks through what the peer-reviewed science actually shows, including the findings that complicate the optimistic ROI story, and what they mean for designing a program that can stand up to scrutiny.
Why the Evidence Base Has Changed Since 2010
The Baicker, Cutler, and Song (2010) meta-analysis in Health Affairs synthesized 36 studies and produced the $3.27 and $2.73 figures that became the default justification for corporate wellness investment globally. The finding made intuitive sense: healthier employees cost less. The paper became widely cited, and the ROI estimates were widely repeated.
Then RCT evidence arrived, considerably changing the picture.
The Illinois Workplace Wellness Study, published in the Quarterly Journal of Economics, was the first large-scale, preregistered, independently replicated randomized controlled trial in this literature. Across a 30-month follow-up, researchers found no significant effects on medical spending or absenteeism. Their 99% confidence interval explicitly ruled out the Baicker et al. estimates. They also found something important: prior observational studies were systematically biased. Healthier employees self-selected into wellness programs, which made programs look more effective than they actually were. The apparent ROI was partly an artifact of measurement.
A second large RCT from Song and Baicker (2019) in JAMA reached similar conclusions. Across 160 worksites of a large US retailer, the program improved some self-reported health behaviors at 18 months. But it produced no significant differences in clinical health markers, healthcare spending, absenteeism, job tenure, or job performance.
A scoping review of 47 economic evaluations by Unsal and colleagues (2021) found that ROI estimates vary enormously depending on which costs are included. Programs with disease management components consistently produced higher returns than lifestyle-only programs. Studies that included presenteeism costs tended to show lower ROI. That is because presenteeism is large in magnitude but hard for wellness programs to move.
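The sensitivity of ROI to which costs are counted can be shown with simple arithmetic. A minimal sketch in Python, with entirely hypothetical dollar figures (the point is the accounting choice, not the numbers): if presenteeism losses the program was expected to reduce are netted against the measured savings, the same program looks far less attractive.

```python
# Hypothetical annual figures for a wellness program (illustrative only).
program_cost = 150_000             # total program spend, $
medical_savings = 220_000          # reduced healthcare utilization, $
absenteeism_savings = 110_000      # fewer sick days, valued at wages, $
presenteeism_shortfall = 180_000   # presenteeism losses the program was
                                   # expected to reduce but did not move, $

def roi(savings: float, cost: float) -> float:
    """Dollars returned per dollar spent."""
    return savings / cost

# Narrow accounting: count only medical and absenteeism savings.
narrow = roi(medical_savings + absenteeism_savings, program_cost)

# Broad accounting: net out the presenteeism costs left unmoved.
broad = roi(medical_savings + absenteeism_savings - presenteeism_shortfall,
            program_cost)

print(f"Narrow ROI: ${narrow:.2f} per dollar spent")
print(f"Broad ROI:  ${broad:.2f} per dollar spent")
```

With these made-up inputs, the narrow calculation shows a comfortable return while the broad one barely breaks even, which is the pattern the Unsal review describes when presenteeism enters the analysis.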
None of this means wellness programs are a waste of money. It means that vague, poorly designed programs probably are, and that the $3.27 figure should no longer be used as a default expectation.
What Physical Activity Programs Actually Deliver
Physical activity-based programs have the most consistent scientific track record among wellness interventions. A 2023 systematic review by Marin-Farrona and colleagues, registered in PROSPERO and compliant with PRISMA guidelines, examined worksite physical activity programs and found consistent improvements in cardiorespiratory fitness, body composition, and musculoskeletal health. Programs that combined aerobic and strength-based exercise produced broader health benefits than single-modality interventions.
The Lancet Public Health meta-analysis (2021) is the most comprehensive in this space. It screened more than 10,000 abstracts and included 121 controlled studies, 82 of which were randomized controlled trials, spanning three decades of multicomponent wellness research. The pooled findings showed statistically significant improvements in dietary quality, body weight, waist circumference, blood pressure, cholesterol, and blood glucose compared to control groups. The evidence was strongest for programs that combined multiple behavior change strategies rather than for those that addressed a single health domain.
A 2024 systematic review on physical activity-led health promotion interventions found that these programs also produce mild to moderate improvements in psychological outcomes, specifically stress, depression, and anxiety symptoms, alongside the physical effects. That cross-domain benefit is practically important. It challenges the assumption that physical and mental wellness programs need to run separately.
The catch is participation. Programs that attract mainly already-healthy employees generate minimal marginal benefit. Getting physically inactive or higher-risk employees to engage consistently is where most programs struggle, and it's where the financial returns actually come from.
Related: HR's Role in Managing Employee Mental Health
The Mental Health Evidence Is More Complicated
Mental health interventions in the workplace have a nuanced track record. The Joyce et al. systematic meta-review (2016) in Psychological Medicine reviewed all controlled evidence on workplace interventions for depression and anxiety. The review found that work-focused cognitive behavioral therapy (CBT), particularly when combined with organizational accommodations, produced the strongest reductions in both symptom severity and time off work. Individual therapy that ignored workplace conditions showed benefit for symptoms but less consistent impact on actual work outcomes.
A 2023 Lancet paper on work-related causes of mental ill health makes the point plainly: individual interventions produce limited population-level benefit when the organizational conditions causing poor mental health remain unchanged. Excessive workloads, poor management, low job control, and inadequate pay are documented causes of mental health problems at work. Offering employees mindfulness apps while leaving those conditions in place is not a mental health strategy.
For digital mental health programs specifically, a 2025 JMIR meta-analysis of 81 RCTs across 25,500 participants found small but statistically significant pooled effects on depression (effect size 0.167), anxiety, and stress. CBT-based digital programs and those incorporating human support elements produced larger effects than fully automated ones. An app without a therapeutic structure does not perform like one with it.
An earlier JMIR systematic review and meta-analysis by Carolan and colleagues (2017) found significant improvements in psychological wellbeing and work effectiveness from web-based psychological interventions. CBT-based approaches consistently outperformed psychoeducation-only or mindfulness-only programs.
Screening Without Treatment Access Is a Problem
One finding from the evidence base that organizations frequently overlook: mental health screening by itself doesn't help employees. A 2023 systematic review in BMJ Open examined workplace mental health screening programs specifically. Screening followed by advice or referral alone produced no meaningful improvement in employee mental health (pooled effect size -0.07, 95% CI -0.29 to 0.15). Only screening followed by facilitated access to actual treatment interventions produced a small positive effect.
Many organizations are investing in screening tools without building the treatment referral infrastructure that makes screening worthwhile. Screening an employee, identifying that they're struggling, and then pointing them toward a waiting list or an EAP number is not a treatment pathway. It may even increase distress by raising awareness of problems without resolving them.
The practical sequence from the evidence: build treatment access first. Then screen.
What the Economic Evidence Shows
The de Oliveira et al. (2020) systematic review examined the economic evidence on workplace mental health and substance use interventions. It covered both direct cost savings (healthcare utilization and absenteeism) and indirect returns through reduced presenteeism. The review found something consistent: programs that improved clinical outcomes also generated positive economic returns. Programs that didn't change clinical outcomes also didn't save money.
That relationship matters for how you evaluate programs. If your wellness initiative can't show improved health behaviors, reduced symptom scores, or better clinical markers, waiting for a financial return is not well supported by the evidence.
Organizations with strong mental health programs see estimated returns of around $4 per dollar invested, according to a 2024 systematic review in the American Journal of Public Health, based on controlled studies rather than observational comparisons. The estimate includes reduced absenteeism, lower healthcare utilization, and improved retention. It's credible precisely because it doesn't come from vendor research.
What Separates Programs That Work From Programs That Don't
After reviewing the full body of evidence, the differences between effective and ineffective wellness programs are fairly clear. They're not primarily about budget.
Multicomponent programs outperform single-domain ones
The Lancet Public Health 2021 meta-analysis consistently found that programs addressing multiple health domains together produced larger and more sustained effects than single-component interventions. A gym subsidy is not a wellness program. It's one component of one health domain. On its own, its population-level impact is modest.
Duration matters
The Marin-Farrona et al. review (2023) found that programs shorter than 12 weeks showed weaker effects across the board. Twelve-week programs produce behavior change; sustained behavior change and the health outcomes that follow take considerably longer. Many organizations run a 6-week wellness challenge, see no clinical data move, and conclude wellness programs don't work. The more likely conclusion is that 6 weeks isn't enough.
Who participates matters more than how many participate
The Illinois Workplace Wellness Study demonstrated what practitioners have long suspected: the employees who sign up for wellness programs are disproportionately healthy already. Getting higher-risk employees engaged is the challenge. Those with elevated blood pressure, high stress scores, and sedentary lifestyles are where the health and financial returns actually come from. This requires active outreach, reduced access barriers, and sometimes direct clinical prompting rather than a newsletter invitation.
Organizational conditions can't be ignored
The Lancet mental health review (2023) is explicit on this. Individual-level wellness interventions produce limited impact when the organizational conditions causing ill health remain in place. If your data shows that stress and mental health are the main drivers of poor employee wellbeing, and your organization has 12-hour-day expectations, poor management, and inadequate pay, no wellness program will fix that. The program and the working conditions need to be addressed together.
Related: Human Resource Best Practices Guide
Digital Wellness Programs: Realistic Expectations
Digital wellness platforms attract a lot of employer attention right now, partly because they're scalable and partly because COVID-19 accelerated mental health app adoption. The JMIR meta-review of digital wellness programs, which covered 29 systematic reviews of studies published between 2000 and 2023, found that mental health was the most studied domain, featured in 19 of the 29 included reviews.
The evidence for digital delivery is encouraging in some respects. The 2025 JMIR meta-analysis found effect sizes of around 0.2 to 0.4 for structured CBT-based digital programs that combine automated delivery with human support elements. Those are small-to-moderate effects. They're meaningful at a population scale, but they won't be transformative for individuals with more serious presentations.
A 2024 systematic review on digital health interventions in workplaces found that programs combining behavioral techniques with tracking and peer support showed the most consistent results. Technology-only tools without behavioral structure showed lower effectiveness.
Digital delivery is a channel. What matters is the content that runs through it. A CBT-based digital program delivered via an app is not the same as a wellness app that delivers health tips. The research supports the former. It's much less supportive of the latter.
Measuring Whether Your Program Is Working
The RCT literature offers an uncomfortable lesson about measurement. Both the Illinois Workplace Wellness Study and the Song and Baicker JAMA trial found that self-reported behavior improvements, the kind most organizations track, did not translate to clinical health improvements or business outcomes. Programs that measure only participation rates, employee satisfaction scores, or survey responses about whether employees feel more supported are measuring process, not outcomes.
The scoping review by Unsal et al. (2021) noted that ROI estimates shift substantially depending on which outcomes you measure. If your measurement framework can't detect whether health actually changed, you won't be able to tell whether your program worked.
The minimum outcome set the scientific literature uses, and which organizations should track over at least 12 months, includes: sick leave rates and duration, validated clinical measures for the health domains being addressed (blood pressure, BMI, cholesterol for physical programs; validated depression and anxiety scales for mental health programs), and work performance data. That's what allows a genuine before-and-after comparison.
A 12-month minimum isn't arbitrary. Seasonal variation in sick leave, the Hawthorne effect from simply running a program, and selection effects among early adopters can all produce apparent short-term improvements that wash out. You need a year of data to start distinguishing real program effects from noise.
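The measurement logic above can be sketched in a few lines. This is a minimal illustration with made-up monthly sick-leave rates, not a statistical analysis: comparing full years controls for seasonality, and comparing participants against a non-participant group gives a rough (imperfect, given self-selection) control for company-wide trends.

```python
from statistics import mean

# Hypothetical monthly sick-leave rates (% of workdays lost), Jan..Dec,
# for program participants and a non-participant comparison group,
# in the year before the program and the program year. Illustrative only.
participants_before = [3.1, 3.0, 2.8, 2.5, 2.3, 2.2, 2.1, 2.2, 2.6, 2.9, 3.2, 3.4]
participants_after  = [2.8, 2.7, 2.5, 2.3, 2.1, 2.0, 1.9, 2.0, 2.3, 2.6, 2.9, 3.1]
comparison_before   = [3.3, 3.2, 3.0, 2.7, 2.5, 2.4, 2.3, 2.4, 2.8, 3.1, 3.4, 3.6]
comparison_after    = [3.2, 3.1, 2.9, 2.6, 2.4, 2.3, 2.2, 2.3, 2.7, 3.0, 3.3, 3.5]

# Year-over-year change for each group: comparing full years avoids
# contrasting, say, a summer program window with a winter baseline.
participant_change = mean(participants_after) - mean(participants_before)
comparison_change = mean(comparison_after) - mean(comparison_before)

# Difference-in-differences: the change beyond what the comparison
# group experienced anyway.
program_effect = participant_change - comparison_change

print(f"Participant change: {participant_change:+.2f} pp")
print(f"Comparison change:  {comparison_change:+.2f} pp")
print(f"Estimated effect:   {program_effect:+.2f} pp")
```

In these invented numbers, most of the participants' improvement is shared by the comparison group; only the residual difference is plausibly attributable to the program, which is exactly the short-term noise the 12-month rule is meant to filter.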
How to Design a Program That Earns Its Budget
None of the evidence points toward a single right answer. It points to several design decisions that consistently distinguish programs that produce results from those that don't.
Start with a genuine needs assessment. Survey your employees anonymously about health concerns, identify which groups carry the highest health risk, and build your program around those findings. A program built on what employees actually need will always outperform one built on what a vendor suggested.
Address multiple health domains. A physical activity program, a mental health program, and a financial wellness component, run together, will produce more than any one of them in isolation. The multicomponent evidence is consistent across the literature.
Build treatment access before you screen. Don't survey employees about mental health distress and then leave them with a helpline number. Build the referral pathway, confirm which treatments are accessible and affordable, and only then introduce the screening tool.
Design for the higher-risk employees, not the enthusiastic healthy ones. Programs that fill up with already-healthy participants are delivering most of their sessions to people who need them least. Think about what would make your sedentary, stressed, or financially stretched employees show up. Design around that.
Run it for long enough to measure it. Commit to at least 12 months of program operation and 12 months of outcome data before making a judgment on effectiveness. Track clinical and business outcomes, not just participation rates.
Check what's causing the problem before prescribing a solution. If your sick leave data is driven by occupational stress from poor management or excessive workloads, no number of yoga sessions will change it. Look at what the metrics are actually telling you about root causes, and address those directly alongside whatever wellness programming you run.
The Honest Version of the Business Case
Early ROI estimates for wellness programs were overly optimistic. The JAMA and Quarterly Journal of Economics RCTs made that clear, and the broader evidence base now reflects it. Well-designed, multicomponent programs addressing physical and mental health can produce positive health outcomes and generate real economic returns. The honest qualifier is that poorly designed ones, especially short-duration, single-domain, or passively delivered programs that primarily attract already-healthy employees, are unlikely to produce meaningful financial benefit.
What the evidence clearly supports is this: the return on a good program depends almost entirely on design quality, program duration, who participates, and whether organizational conditions are addressed alongside individual health behaviors. Those are variables your organization controls.