Universal Screening Within a Response-to-Intervention Model

The purpose of this article is to discuss the component of universal screening within a Response-to-Intervention (RTI) model. The goal of this article is to assist the reader in making informed decisions about the nature of universal screening measures. To that end, the article is divided into the following sections:

  1. What is universal screening?
  2. What are the elements of effective universal screening measures?
  3. What are some common universal screening measures?
  4. What types of performance are measured?
  5. What universal screening measures were used in the RTI models in our research review for the RTI Action Network?
  6. How is at risk status defined?
  7. When does Tier 2 begin?
  8. Conclusions and directions for future research

What Is Universal Screening?

In the context of an RTI prevention model, universal screening is the first step in identifying the students who are at risk for learning difficulties. It is the mechanism for targeting students who struggle to learn even when provided with scientific, evidence-based general education (Jenkins, Hudson, & Johnson, 2007). Universal screening is typically conducted three times per school year, in the fall, winter, and spring. Universal screening measures consist of brief assessments focused on target skills (e.g., phonological awareness) that are highly predictive of future outcomes (Jenkins, 2003).

Although most research on universal screening is in the area of reading, there is also research support for the utility of universal screening in the areas of writing, math, and behavior (Fuchs et al., 2007; Jenkins et al., 2007). In a typical RTI model, all students are screened in one or more of these academic areas and those identified as at risk for learning or behavior difficulties are provided evidence-based interventions in the at-risk area. Fuchs et al. (2007) recommend identifying the "risk pool" early (e.g., kindergarten, 1st grade) to allow participation in prevention services before the onset of substantial academic deficits. The goal of early identification of potential problems is to increase the likelihood of at-risk students developing adequate academic competence. However, screening students in the early grades lends itself to at least two common errors: false positives and false negatives. False positives occur when students are deemed at risk when, in fact, they are not. False negatives are cases in which students who are deemed not at risk then go on to perform poorly on a future criterion measure (Jenkins, 2003). "For a prevention system to work effectively, procedures for determining risk must yield a high percentage of true positives while identifying a manageable risk pool by limiting false positives" (Fuchs et al., 2007, p. 312).

Elements of Effective Universal Screening Measures

Taking these common errors (i.e., false positives and false negatives) into consideration, Jenkins (2003) identified four elements of the most effective universal screening measures: a) sensitivity, b) specificity, c) practicality, and d) consequential validity. A screening measure with these four elements will increase the likelihood of identifying true positives and decrease the likelihood of false positives.

Sensitivity. Sensitivity refers to the degree to which a screening mechanism reliably identifies at-risk students who, in fact, perform unsatisfactorily on a future criterion measure (Jenkins et al., 2007). These students are referred to as true positives, those who truly are at risk for future academic difficulties. A screening measure with good sensitivity will also help reduce the numbers of false negatives. This is critical in an RTI model so that all students needing extra assistance receive it.

Specificity. Specificity refers to the degree to which a screening mechanism accurately identifies students who later perform satisfactorily on a criterion measure (Jenkins, 2003). These students are referred to as true negatives, those who truly are not at risk for future academic difficulties. A screening measure with good specificity will also help reduce the numbers of false positives. This is critical in an RTI model because false positives lead to a waste of time and money, and may result in inappropriate instruction for students who don't need it.
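To make the two indices concrete, here is a minimal sketch (not from the article; the student data are invented) that tallies the four screening outcomes and computes sensitivity and specificity from them:

```python
# Illustrative sketch: sensitivity and specificity of a screening measure,
# judged against performance on a later criterion measure.
def screening_accuracy(screened_at_risk, failed_criterion):
    """Each argument is a list of booleans, one entry per student."""
    pairs = list(zip(screened_at_risk, failed_criterion))
    tp = sum(s and f for s, f in pairs)          # flagged, later failed
    fp = sum(s and not f for s, f in pairs)      # flagged, later passed
    fn = sum(not s and f for s, f in pairs)      # not flagged, later failed
    tn = sum(not s and not f for s, f in pairs)  # not flagged, later passed
    sensitivity = tp / (tp + fn)  # share of truly at-risk students flagged
    specificity = tn / (tn + fp)  # share of not-at-risk students not flagged
    return sensitivity, specificity

# Hypothetical data for 6 students: the screen flags the first three;
# two of them later fail, one flagged student does not (a false positive),
# and one unflagged student fails (a false negative).
screened = [True, True, True, False, False, False]
failed   = [True, True, False, False, False, True]
sens, spec = screening_accuracy(screened, failed)
```

In this invented example both indices come out at 2/3: the one false negative lowers sensitivity, and the one false positive lowers specificity.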

Practicality. An effective screening measure should also be brief and simple. An efficient screening measure will quickly identify students who are lagging behind their peers, thereby maximizing instructional time (Hall, 2008). The screening measure should also be simple enough to be implemented on a wide scale by regular school personnel under ordinary conditions (Jenkins, 2003). A simple screening measure does not require a specialist (e.g., school psychologist) for administration and can be performed in the classroom. This may help to reduce student test anxiety, and results can be computed and interpreted quickly and efficiently (Boardman & Vaughn, 2007).

Consequential Validity. Effective universal screening measures should also be consequentially valid. This means the screening measure does no harm to the student (e.g., avoids inequitable treatment) and is linked to effective interventions (Messick, 1989).

After finding universal screening measures containing these effective elements, it is also important to use them consistently. Hall (2008) cautioned that if a school changes assessment measures during the course of the school year, consequences may include a) loss of a comparable set of baseline data, b) substantial duplication of time to retrain teachers on a second screening assessment, c) confusion for students in becoming familiar with new testing routines, and d) mixed signals to teachers about assessment. These potential problems highlight the importance of selecting an effective universal screening measure the first time. Fortunately, several screening measures are commonly used and have been examined in research on RTI programs.

Common Universal Screening Measures

Several universal screening measures have been examined in the context of an RTI model. Although by no means an exhaustive list, the most common universal screening measures are a) curriculum-based measurement (CBM; e.g., Fuchs & Fuchs, 2005; Salvia, Ysseldyke, & Bolt, 2007), b) Dynamic Indicators of Basic Early Literacy Skills (DIBELS; e.g., Good, Simmons, & Kame'enui, 2001; Salvia et al., 2007), c) subtests of the Woodcock Reading Mastery Test–Revised (WRMT-R; Woodcock, 1987) and the Woodcock-Johnson–Revised (WJ-R; Woodcock & Johnson, 1989), and d) the Texas Primary Reading Inventory (TPRI; Texas Education Agency, 1998; Vaughn, Linan-Thompson, & Hickman, 2003). As most of the RTI research is focused in the area of reading, most of the universal screening measures are also focused in the area of reading. However, at least one screening measure (CBM) can be used for reading, writing, and math. When behavior is included in an RTI model, school or local norms for behavior rates are used as the screening measure for at-risk status.

Types of Performance Measured

In each of the common universal screening measures for academics, the type of performance measured is either accuracy or fluency (Jenkins, 2003). Accuracy distinguishes students according to the percentage of correct responses on tasks and can reveal individual differences in knowledge. Fluency distinguishes students by number of correct responses per minute and can reveal individual differences both in knowledge and speed of processing.

Universal Screens Used in Field Studies of RTI

In our research review of RTI programs, Field Studies of RTI Programs, all but one study mentioned using universal screening. However, most studies did not report how often the screening measure was administered or how at-risk status was determined from performance, and there was often insufficient detail to establish how the data or cut-scores were used. Table 1 provides information (e.g., type of measure, frequency) on the universal screening measures used in each of the 11 studies found in our reviews of RTI field studies.

Table 1: Programmatic Field Studies of RTI

Model | Universal Screening Mentioned? | Type | How Often | At-Risk Determination
SPMM | Yes | CBM Math (multiplication, addition, subtraction) | 1 probe each for multiplication, addition, and subtraction | N/R
SCRED | Yes | CBM | 3 times (fall, winter, spring) | N/R
RBM | Yes | DIBELS; CBM Math; CBM Writing | 3 times (fall, winter, spring) | N/R
BSM | Yes | School-wide behavior norms | N/R | N/R
MPSM | Yes | "Global screening data" | N/R | N/R
TRI | Yes | WRMT-R/Normative Update; CBM | N/R | N/R
FSDS | Yes | CBM and DIBELS | 3 times (fall, winter, spring) | <10th percentile
STEEP | Yes | CBM | 1 time | N/R
EGM | Yes | TPRI | 3 times (fall, winter, spring) | <25th percentile

Model Name
SPMM - Standard-protocol mathematics model
SCRED - St. Croix River education district model
RBM - Idaho results-based model
BSM - Behavior support model
IST - Pennsylvania instructional support teams
MPSM - Minneapolis problem-solving model
TRI - Tiers of reading intervention
FSDS - Illinois flexible service delivery system model
IBA - Ohio intervention-based assessment
STEEP - System to enhance educational performance
EGM - Exit group model

How Are At-Risk Students Identified?

At present, there is no clear consensus on which criteria (e.g., cut-scores, percentile ranks) should be used for identifying students who are at risk in Tier 1 of an RTI model (McMaster & Wagner, 2007). Using a relative normative approach, some researchers establish a percentile criterion (e.g., on the WRMT-R, WJ-R, or TPRI) whereby students performing below criterion are considered at risk (Hintze, 2007). For example, all students scoring below the 25th percentile may be considered at risk. According to Torgesen (2000), a "potential problem with such a normative approach is that, by definition, there will always be students who fall in the lowest quartile and thus will always appear to be at risk, regardless of their performance level" (p. 59).

Absolute performance levels or benchmarks (e.g., DIBELS, CBM) can also be used to determine risk status. For example, 3rd-grade students who read fewer than 70 words correct per minute at the beginning of the school year may be considered to be at risk. "Benchmarks may be based on national or local data, and can be determined by using inferential statistics to calculate scores that predict later success, such as meeting end-of-year academic standards or passing state mandated tests" (Hintze, 2007, p. 11).
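The two approaches can be contrasted in a short sketch; the names, scores, and sample size are invented, and the 70 words-correct-per-minute benchmark echoes the example above:

```python
# Hypothetical 3rd-grade oral reading scores (words correct per minute).
scores = {"Ana": 72, "Ben": 75, "Cal": 81, "Dee": 88,
          "Eli": 95, "Fay": 104, "Gus": 110, "Hal": 120}

# Normative approach: flag everyone below the 25th percentile of this sample.
ranked = sorted(scores, key=scores.get)         # lowest scorers first
cutoff = int(len(scores) * 0.25)                # size of the bottom quartile
normative_at_risk = set(ranked[:cutoff])        # always flags someone

# Benchmark approach: flag everyone below an absolute standard,
# e.g., fewer than 70 words correct per minute at the start of the year.
benchmark_at_risk = {name for name, wcpm in scores.items() if wcpm < 70}
```

In this invented sample every student clears the 70-wcpm benchmark, so the benchmark rule flags no one, yet the normative rule still flags the bottom quartile (Ana and Ben) regardless of their absolute performance, which is precisely Torgesen's (2000) criticism of the normative approach.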

In addition to cut-scores for normative and benchmark approaches, performance standards for severity of academic difficulty and level of risk have been used in research on screening measures. According to a review by Jenkins et al. (2007, p. 585), reading severity can be unsatisfactory or very unsatisfactory, while level of risk can be some or high. The use of these additional criteria greatly affects the proportion of students identified as at risk.

Severity criterion as unsatisfactory. Jenkins et al. (2007) reported that most screens focus on predicting unsatisfactory outcomes, where "unsatisfactory is defined as performing below a standard such as ‘more than one-half year below grade level,' ‘below the 25th percentile,' or ‘below a high standard' on a state test" (p. 585). Depending on how it is defined, this unsatisfactory criterion has the capacity to identify a large number of at-risk students, as much as 50%–75% of students in some school districts.

Severity criterion as very unsatisfactory. When the goal of the universal screening measure is to find the students with the most severe academic deficits, very unsatisfactory appears to be the better criterion. This criterion finds the lowest performers, those suspected of having a learning disability. For example, if the criterion for very unsatisfactory is below the 10th percentile, a much smaller pool of at-risk students will be identified compared to a criterion of below the 25th percentile for unsatisfactory.

Levels of risk. Universal screening measures often specify a level of risk for failing to meet a later criterion. For example, a screening measure could classify a student as at some risk or as at high risk for not meeting the standard. Jenkins et al. (2007) used the example of the DIBELS Initial Sound Fluency measure as an illustration of how different proportions of students fall into the two risk categories. Out of a sample of mid-kindergartners, 47% fell into the at some risk category, whereas only 7% fell into the at high risk category.

When Does Tier 2 Begin?

Once a student has been designated at risk by one or more screening measures, the next step is to establish when more intensive Tier 2 interventions will begin. Two methods have emerged from the literature: direct and progress monitoring (Jenkins et al., 2007).

Direct. In the direct method, results of a one-time universal screening measure determine Tier 2 status. For example, in the work of Vellutino et al. (1996) and VanDerHeyden, Witt, and Gilbertson (2007), students designated at risk by a screening measure all received Tier 2 interventions immediately. In both studies, the rationale for this decision was that at-risk students should not be delayed in receiving interventions due to further observation and progress monitoring. A limitation to this method is that it assumes a high level of accuracy for identifying true positives, based on one administration of the screening measure.

Progress monitoring. In the progress-monitoring method, all at-risk students (determined by screening measures) are monitored for an additional period before they receive Tier 2 interventions. Because entry into Tier 2 is determined by a dual discrepancy (i.e., in both performance level and slope of progress), subsequent progress-monitoring measures are needed for comparison with the classroom average. This method provides a more reliable assessment of progress than a "one-shot" assessment; however, it delays interventions for the students most in need of help. The length of follow-up progress monitoring varies in the literature. For instance, Compton, Fuchs, Fuchs, and Bryant (2006) used weekly progress monitoring for 5 weeks to determine Tier 2 eligibility, whereas Speece and Case (2001) used monthly progress monitoring over a 6-month period.
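A dual-discrepancy decision rule of this kind can be sketched as follows; the weekly scores and class averages are invented for illustration, and the 5-week window echoes Compton et al. (2006):

```python
# Hedged sketch of a dual-discrepancy check: a student enters Tier 2 only if
# BOTH performance level AND slope of weekly progress fall below the
# classroom average. All numbers below are hypothetical.
def slope(scores):
    """Ordinary least-squares slope of weekly scores (gain per week)."""
    n = len(scores)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(scores) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, scores))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

def dual_discrepant(student_scores, class_level, class_slope):
    level_low = student_scores[-1] < class_level   # discrepant in level
    slope_low = slope(student_scores) < class_slope  # discrepant in growth
    return level_low and slope_low

# 5 weeks of monitoring: low level AND flat growth -> enter Tier 2.
needs_tier2 = dual_discrepant([20, 21, 21, 22, 22],
                              class_level=45, class_slope=1.5)
# Low level but adequate growth -> remain in Tier 1 for now.
stays_tier1 = dual_discrepant([20, 24, 28, 31, 35],
                              class_level=45, class_slope=1.5)
```

The second student is below the class average in level but not in slope, so under a dual-discrepancy rule the adequate growth rate keeps that student out of Tier 2.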

There is conflicting research evidence as to whether the direct or the progress-monitoring method is preferable. For instance, Compton et al. (2006) found that 5–10 weeks of progress monitoring improved overall screening accuracy (i.e., better sensitivity and specificity), whereas Speece (2005) found no improvement from subsequent progress monitoring. With no clear consensus, the choice of method is ultimately a local school district preference. "Tolerance of under- or over-identification rates will lead to different choices" based on a school district's resources (Jenkins et al., 2007, p. 599).

Conclusion and Directions for Future Research

Universal screening is paramount in identifying students at risk for academic difficulty in an RTI model. Correct identification of at-risk students is especially important so that the right students receive appropriate tiered interventions. Unfortunately, given the differing conventions for cut-scores, severity, and levels of risk, it is very difficult to generalize percentages of at-risk students across measures and samples, which makes comparison of screening measures extremely difficult. It is therefore imperative that education professionals understand how different combinations of cut-scores, severity, and risk affect the number of students identified as at risk.

Additional research efforts and comparisons across screening approaches using common validation criteria are needed to determine the precision of individual measurement tools in identifying at-risk students (Jenkins, 2003). Specifically, additional research is necessary to ensure that the sensitivity, specificity, positive predictive value (the proportion of students correctly identified as at risk out of all students identified as at risk), and negative predictive value (the proportion of students correctly identified as not at risk out of all students not identified as at risk) of instruments are adequate for determining those who are at risk (Glover & DiPerna, 2007). In addition, more research is needed to investigate the accuracy of screening approaches used in identifying student difficulties in content areas other than reading (e.g., writing, math, and behavior) in the context of an RTI model (Glover & Albers, 2007) and whether accuracy is improved by incorporating additional progress monitoring.
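The four accuracy indices named above reduce to simple ratios of the four screening outcomes; the following sketch uses invented counts for a hypothetical cohort of 200 screened students:

```python
# Hypothetical screening outcomes for 200 students (counts are invented):
# tp = flagged and later failed the criterion measure (true positives)
# fp = flagged but later passed (false positives)
# fn = not flagged but later failed (false negatives)
# tn = not flagged and later passed (true negatives)
tp, fp, fn, tn = 18, 12, 4, 166

sensitivity = tp / (tp + fn)  # share of truly at-risk students flagged
specificity = tn / (tn + fp)  # share of not-at-risk students not flagged
ppv = tp / (tp + fp)          # flagged students who truly are at risk
npv = tn / (tn + fn)          # unflagged students who truly are not at risk
```

With these invented counts the screen flags 30 students, of whom only 18 (a PPV of 0.6) truly need intervention, illustrating how a screen can have high sensitivity yet still dilute the risk pool with false positives.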


References

Ardoin, S. P., Witt, J. C., Connell, J. E., & Koenig, J. L. (2005). Application of a three-tiered response to intervention model for instructional planning, decision making, and the identification of children in need of services. Journal of Psychoeducational Assessment, 23, 362–380.

Boardman, A. G., & Vaughn, S. (2007). Response to intervention as a framework for the prevention and identification of learning disabilities: Which comes first, identification or intervention? In J. B. Crockett, M. M. Gerber, & T. J. Landrum (Eds.), Achieving the radical reform of special education: Essays in honor of James M. Kauffman (pp. 15–35). Mahwah, NJ: Erlbaum.

Bollman, K. A., Silberglitt, B., & Gibbons, K. A. (2007). The St. Croix River education district model: Incorporating systems-level organization and a multi-tiered problem-solving process for intervention delivery. In S. R. Jimerson, M. K. Burns, & A. M. VanDerHeyden (Eds.), Handbook of response to intervention: The science and practice of assessment and intervention (pp. 319–330). New York: Springer.

Callender, W. A. (2007). The Idaho results-based model: Implementing response to intervention statewide. In S. R. Jimerson, M. K. Burns, & A. M. VanDerHeyden (Eds.), Handbook of response to intervention: The science and practice of assessment and intervention (pp. 331–342). New York: Springer.

Compton, D. L., Fuchs, D., Fuchs, L. S., & Bryant, J. D. (2006). Selecting at-risk readers in first grade for early identification: A two-year longitudinal study of decision rules and procedures. Journal of Educational Psychology, 98, 394–409.

Fairbanks, S., Sugai, G., Guardino, D., & Lathrop, M. (2007). Response to intervention: Examining classroom behavior support in second grade. Exceptional Children, 73, 288–310.

Fuchs, D., & Fuchs, L. S. (2005). Responsiveness-to-intervention: A blueprint for practitioners, policymakers, and parents. Teaching Exceptional Children, 38, 57–61.

Fuchs, L. S., Fuchs, D., Compton, D. L., Bryant, J. D., Hamlett, C. L., & Seethaler, P. M. (2007). Mathematics screening and progress monitoring at first grade: Implications for responsiveness to intervention. Exceptional Children, 73, 311–330.

Glover, T. A., & Albers, C. A. (2007). Considerations for evaluating universal screening assessments. Journal of School Psychology, 45, 117–135.

Glover, T. A., & DiPerna, J. C. (2007). Service delivery for response to intervention: Core components and directions for future research. School Psychology Review, 36, 526–540.

Good, R. H., III, Simmons, D. C., & Kame'enui, E. J. (2001). The importance and decision-making utility of a continuum of fluency-based indicators of foundational reading skills for third-grade high-stakes outcomes. Scientific Studies of Reading, 5, 257–288.

Hall, S. L. (2008). Implementing response to intervention: A principal's guide. Thousand Oaks, CA: Corwin Press.

Hintze, J. M. (2007). Conceptual and empirical issues related to developing a response-to-intervention framework. Paper presented at the National Center on Student Progress Monitoring, Washington, DC. Retrieved May 26, 2008, from

Jenkins, J. R. (2003, December). Candidate measures for screening at-risk students. Paper presented at the National Research Center on Learning Disabilities Responsiveness-to-Intervention symposium, Kansas City, MO. Retrieved May 15, 2008, from

Jenkins, J. R., Hudson, R. F., & Johnson, E. S. (2007). Screening for at-risk readers in a response to intervention framework. School Psychology Review, 36, 582–600.

Kovaleski, J. F., Gickling, E. E., Morrow, H., & Swank, H. (1999). High versus low implementation of instructional support teams: A case for maintaining program fidelity. Remedial and Special Education, 20, 170–183.

Marston, D., Muyskens, P., Lau, M., & Canter, A. (2003). Problem-solving model for decision making with high-incidence disabilities: The Minneapolis experience. Learning Disabilities Research & Practice, 18, 187–200.

McMaster, K. L., & Wagner, D. (2007). Monitoring response to general education instruction. In S. R. Jimerson, M. K. Burns, & A. M. VanDerHeyden (Eds.), Handbook of response to intervention: The science and practice of assessment and intervention (pp. 223–233). New York: Springer.

Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). New York: Macmillan.

O'Connor, R. E., Harty, K. R., & Fulmer, D. (2005). Tiers of intervention in kindergarten through third grade. Journal of Learning Disabilities, 38, 532–538.

Peterson, D. W., Prasse, D. P., Shinn, M. R., & Swerdlik, M. E. (2007). The Illinois flexible service delivery model: A problem-solving model initiative. In S. R. Jimerson, M. K. Burns, & A. M. VanDerHeyden (Eds.), Handbook of response to intervention: The science and practice of assessment and intervention (pp. 300–318). New York: Springer.

Salvia, J., Ysseldyke, J. E., & Bolt, S. (2007). Assessment in special and inclusive education (10th ed.). New York: Houghton Mifflin.

Speece, D. L. (2005). Hitting the moving target known as reading development: Some thoughts on screening children for secondary interventions. Journal of Learning Disabilities, 38, 487–493.

Speece, D. L., & Case, L. P. (2001). Classification in context: An alternative approach to identifying early reading disability. Journal of Educational Psychology, 93, 735–749.

Telzrow, C. F., McNamara, K., & Hollinger, C. L. (2000). Fidelity of problem-solving implementation and relationship to student performance. School Psychology Review, 29, 443–461.

Texas Education Agency. (1998). Texas Primary Reading Inventory (TPRI). Austin, TX: Author.

Torgesen, J. K. (2000). Individual differences in response to early interventions in reading: The lingering problem of treatment resisters. Learning Disabilities Research & Practice, 15, 55–64.

VanDerHeyden, A. M., Witt, J. C., & Gilbertson, D. (2007). A multi-year evaluation of the effects of a response to intervention (RTI) model on identification of children for special education. Journal of School Psychology, 45, 225–256.

Vaughn, S., Linan-Thompson, S., & Hickman, P. (2003). Response to instruction as a means of identifying students with reading/learning disabilities. Exceptional Children, 69, 391–409.

Vellutino, F. R., Scanlon, D. M., Sipay, E. R., Small, S. G., Chen, R., Pratt, A., & Denckla, M. B. (1996). Cognitive profiles of difficult-to-remediate and readily remediated poor readers: Early intervention as a vehicle for distinguishing between cognitive and experimental deficits as basic causes of specific reading disability. Journal of Educational Psychology, 88, 601–638.

Woodcock, R. W. (1987). Woodcock Reading Mastery Test–Revised. Circle Pines, MN: American Guidance Service.

Woodcock, R. W., & Johnson, M. B. (1989). Woodcock-Johnson Psychoeducational Battery–Revised. Allen, TX: DLM.
