• In 2020 we commissioned a four-year external partner (UCL) to help us understand the impacts we are seeing for young people on our programme
  • Having reflected on the first three years of data, they have introduced a new methodology this year to provide a more meaningful benchmark
  • Here, we reflect on the findings so far and what these results mean for us going forward 

As an evidence-led organisation, we want to understand what change happens for children and young people whilst they are on the West London Zone (WLZ) programme. We use a range of monitoring and evaluation approaches to do this, spanning both quantitative and qualitative methods.

In 2020, we commissioned a four-year external evaluation partner - UCL Centre for Education Policy and Equalising Opportunities (CEPEO) and the Helen Hamlyn Centre for Pedagogy (just ‘UCL’ from here onward) - to help us understand what change we are seeing for young people on the programme.

You can read more about the UCL evaluation and its methodology here.

Learning from our approach

In the last two years of the impact analysis, UCL used a ‘propensity score matching’ method to identify a comparison group of children who ‘look’ similar to WLZ children, but who are not receiving the programme. However, a key challenge in making this approach work for WLZ’s context is that the pool of potential candidates for this comparison group is small: as a result, the comparison group tends to have fewer areas of risk/areas in which they need support. Unfortunately, this means benchmarking against the comparison group is unlikely to give us a realistic assessment of what progress WLZ young people are making. A simplified sketch of the matching idea follows below.
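To make the idea concrete, here is a minimal, purely illustrative sketch of propensity score matching in Python. The data file and column names (on_programme, attendance, n_risks, eligible_fsm, outcome) are invented for the example; this is not UCL's analysis code, just the general shape of the technique.

```python
# Illustrative sketch of propensity score matching. All names below
# (children.csv, on_programme, attendance, n_risks, eligible_fsm, outcome)
# are hypothetical, not WLZ's or UCL's actual data.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

df = pd.read_csv("children.csv")  # one row per child
covariates = ["attendance", "n_risks", "eligible_fsm"]

# Step 1: model each child's probability of being on the programme
# (the "propensity score"), given their background characteristics.
model = LogisticRegression().fit(df[covariates], df["on_programme"])
df["propensity"] = model.predict_proba(df[covariates])[:, 1]

# Step 2: for each programme child, find the non-programme child with
# the closest propensity score (1-nearest-neighbour matching).
treated = df[df["on_programme"] == 1]
control = df[df["on_programme"] == 0]
nn = NearestNeighbors(n_neighbors=1).fit(control[["propensity"]])
_, idx = nn.kneighbors(treated[["propensity"]])
matched_control = control.iloc[idx.ravel()]

# Step 3: compare average outcomes between the matched groups.
effect = treated["outcome"].mean() - matched_control["outcome"].mean()
print(f"Estimated effect on outcome: {effect:.3f}")
```

The challenge described above shows up in Step 2: when the pool of non-programme children is small and systematically lower-risk, even the ‘closest’ matches are not close enough for the comparison to be fair.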

In part as a response to this challenge, this year UCL has introduced a new methodology to try to find a comparison group that is a closer ‘fit’ for children on the WLZ programme. This could provide us with a meaningful benchmark when UCL conducts its final analysis in Autumn 2024.

Introducing a new methodology - discontinuity design analysis

This year UCL has conducted a discontinuity design analysis. Rather than matching, this approach takes advantage of the fact that WLZ applies a ‘hard’ attendance threshold when selecting children for the programme: a child with attendance below 96% is considered ‘at risk’, and this contributes to our assessment of their total number of ‘risks’ when selecting the cohort.

If we imagine two almost identical children, one with 95.9% attendance and one with 96.1% attendance, the child with 95.9% attendance is more likely to be enrolled on the WLZ programme – even though the difference between them (0.2 percentage points of attendance) is unlikely to signify a ‘real’ difference in their background characteristics. It could literally represent the bad luck of catching a cold that results in a child missing a single day of school.

The discontinuity analysis compares the outcomes of children who fall just either side of the 96% attendance cut-off (with some added work to take into account the proportion on either side who were actually selected onto the programme). We believe this discontinuity method has a greater chance of showing causal impact than propensity score matching, because it compares two much more similar groups of children; as noted above, it has proved difficult to find a close enough comparison group to the WLZ cohort by matching alone. A simplified sketch of the idea follows below.
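For illustration only, here is a minimal sketch of a ‘sharp’ discontinuity estimate at the 96% cut-off in Python. The file and column names are invented, and UCL's actual analysis also adjusts for the proportion on each side who joined the programme (a ‘fuzzy’ design), which is omitted here for brevity.

```python
# Illustrative sketch of a sharp discontinuity estimate at a 96%
# attendance cut-off. Hypothetical file/column names throughout;
# not UCL's actual analysis, which also handles the "fuzzy" case.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("children.csv")

# Centre the running variable (attendance) on the cut-off and keep
# only children within a narrow window around it.
df["dist"] = df["attendance"] - 96.0
window = df[df["dist"].abs() <= 2.0].copy()
window["below"] = (window["dist"] < 0).astype(int)

# Local linear regression with separate slopes either side of the
# cut-off: the coefficient on `below` estimates the jump in outcomes
# at 96%, which (under the design's assumptions) is the programme effect.
fit = smf.ols("outcome ~ below + dist + below:dist", data=window).fit()
print(fit.params["below"], fit.pvalues["below"])
```

The key design choice is that children within a narrow window of the cut-off are assumed to be effectively interchangeable, so any jump in outcomes exactly at 96% can be attributed to programme selection rather than to background differences.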

What’s next

We were aware that this year’s analysis was unlikely to show statistically significant ‘results’, as UCL had a small sample to work with. That said, it was encouraging from our point of view that the analysis found movements in the ‘right’ direction: children selected onto the programme as a result of falling just under the 96% cut-off achieved better attainment and socio-emotional outcomes than those just above the threshold who were not on the programme.

This year the effects did not meet the threshold for statistical significance: this was in line with our expectations given the sample size. Next year we will repeat the analysis using data from all four of our evaluation cohorts from 2020-2024: with this bigger sample there is a greater likelihood that, if we are having an impact, the effect will be measurable and meet the conditions for statistical significance. 

We’re proud that our pre-/post data over a number of years shows our positive impact on children’s socio-emotional outcomes, but without a robust point of comparison, we cannot say how much of this change was down to us. We hoped when we started our evaluation with UCL that the propensity score matching methodology would provide us with a straightforward answer to this question: although it is frustrating for both us and UCL that this has not been the case, we’re grateful to the research team for developing an alternative method to understand what change we’re driving with the young people on the programme. We look forward to seeing the results this Autumn in UCL’s final report at the end of the evaluation.