How Do We Know Organizational Development Actually Works?
- ינון עמית
A Guide to Evaluating the Effectiveness of Organizational Development Processes
There is a moment like this in almost every organization. A conference room. The end of a long day. On the table sits a summary presentation of an organizational development process that has taken months. Leadership workshops, team development, learning processes, executive coaching. Then one of the managers asks, “Okay. But how do we know this actually works?”
The question sounds innocent. Sometimes it is even asked with a smile. But within seconds it changes the atmosphere in the room. Because despite all the work that has been done, despite the sense of progress, despite the stories coming from the field, it is difficult to give a short and clear answer.
How do we actually measure the effectiveness of organizational development?
Not the number of workshops. Not satisfaction with the training. The real question is something else entirely.
Has something in the organization actually changed?
Do managers make decisions differently?
Do teams collaborate more effectively?
Is the organization better able to deal with complexity?
These are not simple questions. In fact, they sit at the center of one of the most interesting paradoxes in the field of management. On the one hand, almost every large organization today invests in organizational development. Leadership programs, change initiatives, management capability development. On the other hand, when organizations try to evaluate the effectiveness of these investments, the task turns out to be far more complex than it initially appears. Not because organizational development does not work. In many cases it works very well. The difficulty arises because organizational impact unfolds in indirect, cumulative, and sometimes unexpected ways.
Over the past decades, several major approaches have emerged in the research literature in an attempt to address this challenge. Some focus primarily on measurable outcomes. Others emphasize change processes. Still others view evaluation itself as an integral part of organizational learning.
In this article I will try to bring some order to this complex field. We will examine three major models for evaluating the effectiveness of organizational development processes, explore the advantages and limitations of each, and then propose an integrative framework that combines their insights.
Before we turn to the models themselves, however, it is worth pausing for a moment to consider the question behind the question. Why is it so difficult to evaluate the effectiveness of organizational development in the first place?
The answer, as we will soon see, is closely related to the way organizational change actually unfolds.

Why Is Organizational Development So Difficult to Measure?
When an organization implements a new information system, evaluating its impact is relatively straightforward. We can examine processing time, the number of errors, operating costs, or productivity. In many cases it is possible to identify a clear before and after difference.
Organizational development operates in an entirely different space. Leadership development processes, cultural change initiatives, and team development efforts do not modify a technological system. They reshape patterns of thinking, relationships between people, and the ways decisions are made. Changes of this kind occur gradually. They are influenced by the broader organizational context, by leadership, by external events, and sometimes by unexpected crises. As a result, isolating the contribution of a single intervention within a wide set of influences is extremely difficult.
This challenge creates a deep paradox. On the one hand, organizations expect organizational development to influence performance, innovation, service quality, and the ability to adapt to change. On the other hand, traditional tools for measuring effectiveness struggle to capture processes that unfold within complex social systems.
For this reason, the professional literature in the field has attempted over the years to develop different models for evaluating the effectiveness of organizational interventions.
Why Organizational Surveys Sometimes Confuse Us
One of the most puzzling moments in organizational development processes occurs when the data does not behave as expected.
Imagine an organization that has invested an entire year in developing its managers. Workshops were conducted, personal coaching was provided, and teams engaged in structured learning processes. On the ground, the feeling is that something has changed. Managers report having different kinds of conversations with their employees. Collaboration between units improves. Then the organizational survey arrives.
Surprisingly, some of the indicators decline. Satisfaction with management does not increase. In some cases it even drops slightly.
At this point the question arises almost immediately. Did the process fail?
Not necessarily.
To understand why, it is important to recognize a distinction developed in the evaluation literature. During an organizational intervention, three different types of change may occur.
The first type is known as alpha change. This is the simplest case. Something in the organizational reality has genuinely changed. For example, collaboration between teams has improved in practice, or managers have begun providing higher quality feedback to employees. When we measure before and after the intervention, we expect to see this difference reflected in the data.
However, something else often happens during organizational development processes. Sometimes not only the organizational reality changes, but also the way people measure it. This is known as beta change.
Imagine a manager who at the beginning of a process rated himself very highly on a measure of team management. Not because he was an exceptional manager, but because he had no clear standard for what good management really looks like. After a year of learning, exposure to advanced leadership models, and in-depth conversations with colleagues, that same manager suddenly understands how complex managing a team truly is. The next time he completes the survey, he gives himself a lower score.
Has he become a worse manager? Often the opposite is true. The standard has changed.
There is also a third and deeper form of change. This is gamma change. Here it is not only the scale that changes, but the definition of the phenomenon itself. For example, managers who entered a development process with a hierarchical view of leadership may gradually begin to see leadership as the creation of partnership and trust. When the concept itself changes, the way participants interpret survey questions changes as well.
As a result, when we measure organizational change over time, the data does not necessarily reflect only changes in organizational reality. It may also reflect changes in how people understand that reality.
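To make this concrete, here is a minimal sketch of one way analysts sometimes try to separate alpha change from beta change, using a retrospective “then” rating collected alongside the post measure (the so-called then-test). The data, column names, and numbers below are invented for illustration only.

```python
# Illustrative sketch: separating alpha change from beta change using a
# retrospective "then" rating. All names and numbers are hypothetical.
#
# Each manager rates "quality of team management" on a 1-7 scale:
#   pre  - self-rating collected before the program
#   post - self-rating collected after the program
#   then - rating given after the program of how they *really* were before it,
#          judged by their new, more demanding standard
ratings = [
    {"manager": "A", "pre": 6.0, "post": 5.5, "then": 4.5},
    {"manager": "B", "pre": 5.5, "post": 6.0, "then": 5.0},
    {"manager": "C", "pre": 6.5, "post": 5.0, "then": 4.0},
]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# A naive pre/post comparison mixes real improvement with the shifted standard.
naive_change = mean(r["post"] - r["pre"] for r in ratings)

# Alpha change: improvement measured against the new standard (post vs. then).
alpha_change = mean(r["post"] - r["then"] for r in ratings)

# Beta change: recalibration of the standard itself (then vs. pre).
beta_change = mean(r["then"] - r["pre"] for r in ratings)

print(f"naive pre/post change:           {naive_change:+.2f}")
print(f"alpha change (real improvement): {alpha_change:+.2f}")
print(f"beta change (shifted standard):  {beta_change:+.2f}")
```

In this invented example the naive pre/post comparison shows a decline even though managers improved against their own new standard. Gamma change cannot be captured this way at all, because the meaning of the question itself has shifted; detecting it usually requires qualitative methods such as interviews.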
When we examine these three types of change more closely, it becomes clear that they are not merely a measurement problem. They reflect different depths of change within the organizational system itself. Alpha change generally corresponds to what systems thinkers describe as first-order change. Behaviors improve and processes evolve, but the basic rules of the system remain the same. Beta change suggests something deeper. Here not only behavior changes, but also the standards by which it is judged. What once counted as good management may no longer be sufficient. Gamma change goes even further. The organization begins to redefine the meaning of concepts such as leadership, collaboration, or responsibility. In this sense it resembles what scholars describe as third-order change, a transformation in the underlying assumptions through which the system understands itself.
This insight helps explain why evaluating organizational development is so complex. Successful interventions do not change behavior alone. They change awareness, standards, and interpretations. When those interpretations change, the measurement itself changes as well.
The implication is important. Evaluating the effectiveness of organizational development cannot rely on a single indicator. It must combine quantitative data, an understanding of organizational processes, and deeper analysis of how people interpret the reality in which they operate.
It is precisely at this point that different approaches to evaluating organizational development become relevant.
The Four-Level Model for Evaluating Training and Development
One of the most influential approaches to evaluating learning and development programs was developed by Donald Kirkpatrick. His model proposes evaluating interventions at four different levels.
The first level focuses on participants’ reactions. It examines how participants experienced the process. Did they find it relevant, engaging, and useful? Most organizations measure this level through feedback questionnaires.
The second level examines learning itself. Did participants acquire new knowledge or skills? At this stage organizations may use assessments, simulations, or professional evaluations.
The third level focuses on behavioral change. The key question here is whether participants actually apply what they learned in their work.
The fourth level examines the impact on organizational performance. Did the intervention contribute to improvements in productivity, service quality, employee satisfaction, or business outcomes?
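For readers who find it helpful, the following sketch shows one way evaluation evidence might be organized by level. The level names come from the model itself; the guiding questions paraphrase it, and the indicators are hypothetical examples rather than a prescribed set.

```python
# Illustrative sketch: organizing evaluation evidence by Kirkpatrick level.
# The levels come from the model; the example indicators are hypothetical.
from dataclasses import dataclass, field

@dataclass
class EvaluationLevel:
    name: str
    guiding_question: str
    example_indicators: list[str] = field(default_factory=list)

kirkpatrick_levels = [
    EvaluationLevel("Reaction", "How did participants experience the program?",
                    ["feedback questionnaires", "perceived relevance ratings"]),
    EvaluationLevel("Learning", "Did participants acquire new knowledge or skills?",
                    ["pre/post knowledge assessments", "scored simulations"]),
    EvaluationLevel("Behavior", "Do participants apply what they learned at work?",
                    ["360-degree feedback trends", "observed changes in meeting practices"]),
    EvaluationLevel("Results", "Did the program affect organizational performance?",
                    ["productivity and service quality metrics", "employee satisfaction scores"]),
]

for level in kirkpatrick_levels:
    print(f"{level.name}: {level.guiding_question}")
    print("  example indicators:", ", ".join(level.example_indicators))
```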
The central contribution of this model lies in clarifying that not all measures of effectiveness capture the same phenomenon. Satisfaction with a program does not necessarily indicate learning, and learning does not automatically translate into behavioral change.
Despite the model’s importance, applying it within complex organizational settings raises several significant challenges. As one moves upward through the levels, it becomes increasingly difficult to establish a clear connection between the intervention and the observed results.
The Problem With Level Four
In many organizations there is an expectation that leadership development programs will ultimately lead to improvements in organizational performance. Yet when researchers attempt to examine this connection empirically, its complexity becomes apparent. Suppose an organization invests in a senior leadership development program. A year later the organization’s performance improves. Can this improvement be attributed to the development program? Perhaps. But it might also result from regulatory changes, economic recovery, strategic shifts, or the appointment of new leaders. The difficulty is not that the intervention has no effect. The difficulty lies in isolating its contribution within a complex system.
One way to address this problem relies on experimental research methodology. In this approach researchers attempt to compare groups that participated in the intervention with groups that did not. This method allows for relatively strong causal inference and is widely used in scientific research and in fields such as medicine and education. However, applying this method in organizational settings is often difficult. Creating genuine control groups can be challenging. Organizational units influence one another. External variables are difficult to control. In addition, organizational change is rarely a single event. It unfolds over time and interacts with many parallel processes.
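Where reasonably comparable units do exist, the comparison logic is simple in principle. The sketch below illustrates it as a difference-in-differences style calculation with invented unit scores; a real study would add statistical tests and controls for the complications described above.

```python
# Illustrative sketch: a difference-in-differences style comparison between
# units that went through the program and comparable units that did not.
# All scores are invented; in practice each value would be a unit-level
# average of a survey scale or performance indicator.

intervention_units = {"before": [3.4, 3.6, 3.5], "after": [3.9, 4.1, 4.0]}
comparison_units = {"before": [3.5, 3.3, 3.6], "after": [3.6, 3.5, 3.7]}

def mean(values):
    return sum(values) / len(values)

change_intervention = mean(intervention_units["after"]) - mean(intervention_units["before"])
change_comparison = mean(comparison_units["after"]) - mean(comparison_units["before"])

# The comparison group's change approximates what would have happened anyway
# (economic shifts, new leadership, regulation); the difference between the two
# changes is the estimate attributed to the intervention.
estimated_effect = change_intervention - change_comparison
print(f"estimated intervention effect: {estimated_effect:+.2f}")
```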
To understand why evaluating organizational development is so complex, we must examine how organizational change actually unfolds. When senior leaders learn a new management model, the organization does not immediately transform. Initially learning occurs. Some managers begin experimenting with new ideas. Gradually patterns of work change, relationships evolve, and sometimes elements of organizational culture begin to shift. This process is not linear. It emerges through countless interactions between individuals, teams, and different managerial levels. In other words, organizational development influences complex social systems. This understanding has led to the emergence of new approaches to evaluating organizational change.
Developmental Evaluation of Change Processes
One particularly interesting approach to evaluating complex interventions was developed by Michael Quinn Patton. This approach, often referred to as developmental evaluation, starts from the assumption that change processes in organizations are not always predictable.
Rather than measuring success solely through predefined indicators, the approach suggests examining how an intervention influences the organization’s capacity to learn and adapt. In other words, evaluation focuses not only on final outcomes but also on the dynamics of organizational learning. This perspective is particularly relevant in situations where organizations operate in complex and rapidly changing environments.
Another response to the limitations of outcome-focused evaluation approaches has been the development of process evaluation. This approach examines not only whether an intervention succeeded, but how it was implemented in practice. Process evaluation addresses questions such as:
Was the intervention implemented as intended?
How did participants experience the process?
What factors supported implementation and what factors hindered it?
Through what mechanisms did change occur in behavior or organizational functioning?
By examining the implementation process, organizations can better understand the connection between the intervention and its outcomes. If an intervention fails to produce the expected results, process evaluation may reveal whether the problem lies in the design of the intervention, in its implementation, or in the organizational conditions within which it took place. This approach allows researchers and practitioners to open the black box of organizational change and accumulate knowledge about the conditions under which interventions succeed or struggle to generate impact.
Another perspective that has gained attention in recent years is the realist approach to evaluation. This perspective is built on the idea that the key question in evaluating an intervention is not simply whether it worked, but under what conditions it works and through what mechanisms. Realist evaluation focuses on three core elements. The context in which the intervention takes place, including the organizational and environmental conditions that influence implementation. The mechanisms through which the intervention operates, meaning the psychological and social processes through which it influences participants’ behavior. The outcomes that emerge from the interaction between context and mechanisms. This approach helps explain why an intervention may succeed in one organizational context but not in another. It therefore provides an important foundation for building practical knowledge about the conditions under which organizational development interventions can produce meaningful change.
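One lightweight way to work with this perspective is simply to record context-mechanism-outcome configurations as structured hypotheses that the evaluation then tests and refines. The sketch below is illustrative; the entries are invented examples, not findings.

```python
# Illustrative sketch: recording context-mechanism-outcome (CMO) configurations
# from a realist evaluation. The entries are hypothetical examples.
from dataclasses import dataclass

@dataclass
class CMOConfiguration:
    context: str    # organizational conditions surrounding the intervention
    mechanism: str  # the psychological or social process the intervention triggers
    outcome: str    # what emerges when the mechanism fires in that context

configurations = [
    CMOConfiguration(
        context="Senior leaders attend the sessions and model the new practices",
        mechanism="Managers feel safe to experiment with new ways of leading",
        outcome="Team meetings shift toward joint problem solving",
    ),
    CMOConfiguration(
        context="The program runs alongside heavy cost-cutting pressure",
        mechanism="Managers treat the workshops as a compliance exercise",
        outcome="Little change in day-to-day management behavior",
    ),
]

for cmo in configurations:
    print(f"Context:   {cmo.context}")
    print(f"Mechanism: {cmo.mechanism}")
    print(f"Outcome:   {cmo.outcome}\n")
```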
Several years ago a large government organization conducted an extensive leadership development process. The program included workshops, team-based learning, and coaching for senior managers. At the end of the first year the organization’s leadership requested an evaluation of the program’s effectiveness. The initial data was encouraging. Participants reported high satisfaction and significant learning. Yet when analysts attempted to identify changes in organizational functioning, the picture became more complex. In some units, management practices clearly changed. Managers began working in a more structured and collaborative way, sharing information and involving employees in decision making. In other units, almost no change occurred. When the differences between units were examined more closely, an interesting pattern emerged. The key difference was not the development program itself but the organizational context. In units where senior leaders actively supported the process and integrated it into daily work routines, the change was substantial. In other units the program remained a valuable learning experience but did not influence patterns of work. This example illustrates how strongly the effectiveness of organizational development depends on the broader organizational context.
An Integrative Framework for Evaluating Organizational Development
Based on both the research literature and practical experience in organizations, it is possible to propose an integrative framework for evaluating the effectiveness of organizational development. This framework distinguishes between three complementary levels of impact. The first level concerns the intervention itself. At this level we examine the quality of the process, its relevance to participants, and the learning it generates. The second level concerns behavioral change. Here we examine whether the intervention led to changes in work patterns, decision making processes, and interactions between managers and employees. The third level concerns broader organizational impact. At this level we examine whether the behavioral changes contributed to improvements in the organization’s ability to deal with challenges, innovate, and deliver high quality services. The key advantage of this framework lies in its ability to clarify the relationship between these three levels.
Principles for Designing an Organizational Evaluation Model
The review presented so far highlights that evaluating the effectiveness of organizational development cannot rely on a single measurement model or a single indicator of success. Organizational change processes unfold within complex systems, are influenced by multiple factors, and develop over time. An effective evaluation model must therefore reflect this complexity and rely on several complementary principles. These principles are not merely technical guidelines for selecting measurement tools. They reflect a broader conception of evaluation as an organizational learning process aimed at understanding how development interventions generate change and how their effectiveness can be improved over time.
Multidimensional Evaluation of Effectiveness
One of the central lessons from the literature is that the effectiveness of an organizational development intervention cannot be captured by a single metric. Such interventions may influence several dimensions of organizational functioning simultaneously, including the quality of the intervention process, the behaviors of managers and employees, and broader organizational outcomes.
An effective evaluation model should therefore be multidimensional and distinguish between different types of impact. This distinction helps organizations better understand the contribution of the intervention and the way influence at one level may eventually translate into influence at other levels. For example, a leadership development program may be highly successful in terms of learning and participant experience, yet its influence on organizational performance may remain limited if no change occurs in the way the leadership team actually operates.
Combining Quantitative and Qualitative Data
Organizational change processes involve many aspects that cannot be fully captured through statistical indicators. Variables such as the quality of collaboration between units, levels of trust between managerial levels, or the quality of decision making processes cannot always be measured through quantitative metrics alone.
At the same time, quantitative measurement offers an important advantage because it allows comparisons across time and between different units within the organization.
For this reason, an effective evaluation model should combine quantitative and qualitative data. Quantitative data may include organizational surveys, performance indicators, productivity measures, or human resources data. Qualitative data may include in-depth interviews, focus groups, observations of work processes, or analysis of organizational documentation.
Combining these forms of data allows for a richer and more accurate understanding of how an intervention influences the organization.
Evaluation Over Time
The impact of organizational development interventions rarely occurs immediately. Processes of learning, behavioral change, and the adoption of new patterns of work take time. Evaluation conducted only at the end of an intervention may therefore provide an incomplete picture.
An effective evaluation model should include measurement at multiple points in time. A baseline measurement before the intervention provides a reference point. Another measurement at the end of the process can assess participant experience and learning. Additional measurements several months later allow organizations to examine whether behavioral and organizational changes have actually occurred.
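As a simple illustration, a measurement plan along these lines might be written down as follows. The time points and instruments are examples only and would need to be adapted to the specific intervention.

```python
# Illustrative sketch: a measurement plan across multiple points in time.
# Time offsets and instruments are examples, not a prescription.

measurement_plan = [
    {"when": "baseline (before the intervention)",
     "instruments": ["organizational survey", "performance indicators"]},
    {"when": "end of the program",
     "instruments": ["participant feedback", "learning assessment"]},
    {"when": "6 months after the program",
     "instruments": ["repeat organizational survey", "manager interviews"]},
    {"when": "12 months after the program",
     "instruments": ["performance indicators", "follow-up focus groups"]},
]

for point in measurement_plan:
    print(f'{point["when"]}: {", ".join(point["instruments"])}')
```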
Integrating Evaluation Into the Intervention Process
In many cases evaluation is treated as a separate stage that takes place only after an intervention has been completed. Contemporary approaches to organizational development increasingly emphasize that evaluation should be integrated into the intervention itself.
When evaluation mechanisms are embedded within the process, it becomes possible to identify implementation challenges in real time, adjust the intervention when necessary, and strengthen its effectiveness.
Focusing on Mechanisms of Change
Evaluating an intervention should not focus solely on whether change occurred. It is equally important to understand how the change occurred. Focusing on mechanisms of change means examining the psychological and social processes through which the intervention influences participants’ behavior. For example, a leadership development process may improve decision quality by strengthening trust among leadership team members, by developing a shared language for discussing strategic issues, or by creating new mechanisms for sharing information. Understanding these mechanisms allows organizations to better grasp how interventions operate and under what conditions they are most effective.
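For analytically inclined readers, here is a minimal sketch of how one hypothesized mechanism, “the program improved decision quality by strengthening trust,” might be examined with a simple mediation-style regression. The data are simulated and the model is deliberately naive; a real analysis would require proper measures, confidence intervals, and careful attention to confounding.

```python
# Illustrative sketch: a naive mediation-style check of one hypothesized
# mechanism -- "the program improved decision quality by strengthening trust".
# Data are simulated; real analyses need valid measures and confounder control.
import numpy as np

rng = np.random.default_rng(0)
n = 200
participated = rng.integers(0, 2, n).astype(float)   # 1 = went through the program
trust = 0.5 * participated + rng.normal(0, 1, n)      # hypothesized mechanism
decision_quality = 0.6 * trust + 0.1 * participated + rng.normal(0, 1, n)

def ols(y, predictors):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

a = ols(trust, [participated])[1]                     # program -> trust
b = ols(decision_quality, [participated, trust])[2]   # trust -> decision quality
direct = ols(decision_quality, [participated, trust])[1]

print(f"indirect effect through trust: {a * b:+.2f}")
print(f"direct effect of the program:  {direct:+.2f}")
```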
Measuring Systemic Capabilities
Many organizational development interventions aim not only to improve immediate outcomes but also to build long-term organizational capabilities. These may include the organization’s capacity for learning, the quality of decision making processes, the level of collaboration between units, or the ability to adapt to change. Measuring such capabilities is more complex than measuring short-term performance outcomes, yet it is essential for understanding the long-term contribution of organizational development. An effective evaluation model should therefore include indicators that track the development of these systemic capabilities over time.
An Integrative Model for Evaluating Organizational Development
Based on the principles discussed above, an integrative model for evaluating the effectiveness of organizational development processes can be proposed. The model addresses the central challenge of evaluating complex interventions by combining an understanding of the development process itself, the behavioral changes it produces, and its longer term influence on organizational performance.
The model includes three complementary levels of evaluation. The first level is the process level. This level focuses on the quality of the organizational development intervention itself, including the way the process was designed and implemented and the experience of participants. The second level is the behavioral change level. Here the evaluation examines whether managers and employees actually changed the way they work, make decisions, and interact with one another. The third level is the organizational results level. At this stage the evaluation examines whether the behavioral changes contributed to improvements in organizational functioning.
Connections Between the Levels of Evaluation
One of the key advantages of this model is the ability to analyze the relationships between the three levels of evaluation. For example, it may occur that an intervention is perceived as highly valuable at the process level but does not lead to significant behavioral change. In such cases the intervention may have been engaging and intellectually stimulating but lacked mechanisms that support implementation within the organization. In other cases behavioral changes may occur without clear improvements in broader organizational results. This may indicate that the scope of the change was limited or that more time is needed before the effects appear in organizational indicators. Analyzing these patterns allows organizations to better understand the dynamics of change and to improve future development initiatives.
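To show what analyzing these patterns could look like in practice, the sketch below flags the situations described above for each unit. The scores and thresholds are arbitrary placeholders; in a real evaluation they would come from the indicators chosen at each of the three levels.

```python
# Illustrative sketch: flagging cross-level patterns for each unit.
# Scores (0-1 scale) and the threshold are arbitrary placeholders.

unit_scores = {
    "Unit A": {"process": 0.85, "behavior": 0.80, "results": 0.70},
    "Unit B": {"process": 0.90, "behavior": 0.35, "results": 0.30},
    "Unit C": {"process": 0.80, "behavior": 0.75, "results": 0.35},
}

HIGH = 0.6  # placeholder threshold for "meaningful" change at a level

def interpret(scores):
    process_ok = scores["process"] >= HIGH
    behavior_ok = scores["behavior"] >= HIGH
    results_ok = scores["results"] >= HIGH
    if process_ok and behavior_ok and results_ok:
        return "change appears to be cascading through all three levels"
    if process_ok and not behavior_ok:
        return "valued program, but implementation support may be missing"
    if behavior_ok and not results_ok:
        return "behavior is changing; effects may be limited in scope or need more time"
    return "pattern unclear; examine process quality and context more closely"

for unit, scores in unit_scores.items():
    print(f"{unit}: {interpret(scores)}")
```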
Conclusion
Ultimately, the question of organizational development effectiveness is not only a methodological question. It is a deeper question about how organizations understand change. Modern management has become accustomed to measuring almost everything. Productivity, performance, efficiency, costs. In many systems there is even an implicit assumption that if something cannot be measured precisely, it may not truly exist.
Organizational development belongs to a different category of phenomena. It operates through changes in conversations between people. Through the way managers interpret complex situations. Through patterns of trust that gradually emerge between teams. Through the organization’s ability to learn through experimentation and reflection. These processes are difficult to measure directly. Yet they are often the deepest drivers of long-term organizational performance.
This is why the question of how to measure organizational development repeatedly leads to a dead end. Not because there are no good ways to evaluate effectiveness, but because the simple measurement that many managers hope to find does not exist. What does exist is the need for a more sophisticated approach to evaluation. An approach that recognizes that organizational impact emerges at the intersection of results, processes, and systemic learning. An approach that views evaluation not only as a mechanism of control but also as a mechanism of learning. In this sense, evaluating effectiveness is not a stage that follows organizational development. It is part of organizational development itself. When organizations seriously examine the impact of the interventions they undertake, they are not only measuring results. They are learning something deeper about how their organization changes.
For organizational consultants this insight also changes the nature of professional practice. The task is not only to design effective interventions. It is also to help organizations develop the capacity to understand the change they are creating. To build a shared language around organizational impact. And to create learning mechanisms that allow the organization to improve over time. Perhaps this is the most useful way to think about the effectiveness of organizational development. Not as a technical question of measurement, but as a systemic question.
How do organizations learn to understand themselves?
Organizations that attempt to measure organizational development as if they were measuring a machine often become disappointed by the results. Organizations that treat evaluation as part of their learning process often discover something surprising. They become better not only at organizational development. They become better at management itself. Perhaps this is why the most important question is not whether organizational development can be measured perfectly. The more important question is whether the organization knows how to learn from the change it creates.
If the answer is yes, the most meaningful indicator of organizational development effectiveness may not be a number in a table. It may be the organization’s growing ability to become a truly learning organization.
References
Arthur, W., Jr., Bennett, W., Jr., Edens, P. S., & Bell, S. T. (2003). Effectiveness of training in organizations: A meta-analysis of design and evaluation features. Journal of Applied Psychology, 88(2), 234–245.
Burke, W. W. (2017). Organization change: Theory and practice (5th ed.). SAGE Publications.
Cummings, T. G., & Worley, C. G. (2015). Organization development and change (10th ed.). Cengage Learning.
Di Pofi, J. A. (2002). Organizational diagnostics: Integrating qualitative and quantitative methodology. Journal of Organizational Change Management, 15(2), 156–168.
Dryzin-Amit, Y. (2025). OD consulting as a catalyst for change: Adaptive spaces, emergent capabilities, and systemic integration in practice. The Journal of Applied Behavioral Science. Advance online publication.
Kirkpatrick, D. L., & Kirkpatrick, J. D. (2006). Evaluating training programs: The four levels (3rd ed.). Berrett-Koehler Publishers.
Nielsen, K., & Miraglia, M. (2016). What works for whom in which circumstances? On the need to move beyond the ‘what works?’ question in organizational intervention research. Human Relations, 70(1), 40–62.
Nielsen, K., & Randall, R. (2013). Opening the black box: Presenting a model for evaluating organizational-level interventions. European Journal of Work and Organizational Psychology, 22(5), 601–617.
Patton, M. Q. (2011). Developmental evaluation: Applying complexity concepts to enhance innovation and use. Guilford Press.
Schleicher, D. J., Baumann, H. M., Sullivan, D. W., & Yim, J. (2020). Evaluating the effectiveness of performance management: A 30-year integrative conceptual review. Journal of Applied Psychology, 104(7), 851–887.
Weisbord, M. R. (1976). Organizational diagnosis: Six places to look for trouble with or without a theory. Group & Organization Studies, 1(4), 430–447.
Weiss, C. H. (1998). Evaluation: Methods for studying programs and policies (2nd ed.). Prentice Hall.
