| Objectives:
-
Test scores should accurately represent student learning related
to the specific set of content and skill areas covered by the
test.
-
Test scores should not be influenced by a student’s inadequate
test-taking skills or limited familiarity with the item formats
used on the test.
-
Test scores should allow the user to make an accurate inference
regarding student learning related to the larger domain of content
and skill areas (i.e., beyond the specific questions on the test).
To help illustrate these three objectives, consider the following
figure. The circle on the left represents the full domain (or set)
of skills that define a given curricular area the test was designed
to measure, such as science. A test constructed to measure
a student’s attainment of these skills, however, can very rarely
include questions that completely represent the full
domain because of time constraints and the format of the questions.
Instead, a test consists of only a sample of the skills representing
this domain, depicted by the shaded circles in this figure. There
are often other important skills (depicted by the white circles),
such as writing a science lab report, that tests like the Iowa Tests are not able to assess. Even for those areas that are covered by
the test (i.e., the shaded circles), the actual questions included
represent a very small sample of the questions that potentially
could have been asked.

The first objective associated with score meaning
and use is to have a student’s score be an accurate
representation of what the student knows and is able to do in the
specific content and skill areas covered by the questions on the
test—the shaded circles in this figure. One way to help ensure
that this objective can be achieved is to make sure that the student
is familiar with the format of the items used on the test, as well
as other critical test-taking skills (objective #2). Describing
student achievement as it relates to a particular set of questions
on a test, however, is not very informative. Instead, you nearly
always want to be able to make accurate inferences regarding a student’s
learning related to the larger, full domain of content and skill
areas (objective #3).
Negative consequences associated with having scores that
are higher than they should be include the following (in addition
to consequences cited for violations of academic ethics):
-
lost instructional assistance for students because of inaccurate
scores (i.e., students lose out on additional help because their
test scores indicate they’re doing OK),
-
interference with identifying areas of the curriculum/instruction
that need improvement,
-
inability to use the data to help make correct decisions regarding
the effectiveness of a particular type of instructional intervention
(if scores are high enough, it’s assumed the intervention
worked), and
-
inability to make meaningful/accurate comparisons across students,
classes, or schools (fairness/equity issue) for a given year and/or
across time.
Any time actions taken by a teacher and/or administrator
contribute to test scores that do not represent student learning
accurately, those actions may have directly
contributed to the misrepresentation of information. Misrepresentation
of student achievement leads to incorrect decision making and is
also considered unethical.
| Q: |
But, isn’t test preparation for
accountability testing essential so that students will score
just as high as they can? |
According to the guidance provided by Iowa Testing
Programs on the development of district policy regarding test
use, test preparation, and test security as it relates to the
Iowa Tests (Iowa Testing Programs, August 2005):
Let’s now look at a scenario illustrating the
use of practice tests and consider in what ways this practice
might result in compromising the meaning of the resulting test scores.
Mrs. Thompson typically
uses last year’s Advanced Placement (AP) exam to prepare
her students for the exam they will be taking in the spring.
She also uses ACT practice tests for the same purpose. Therefore,
she thought nothing of taking questions from an old ITED and
using them to practice certain elementary areas her advanced
students hadn’t been exposed to for a few years. Prior
to using the questions, just to be on the safe side, she modified
them so that they were not exactly the same as the originals.
For a week prior to the test, Mrs. Thompson used ten of these “modified questions” as warm-up activities. If any
of the questions proved to be trouble areas, she conducted mini-review
lessons with her students. |
To determine how this practice might impact the meaning
of the resulting scores, it’s very helpful to once again consider
the guidance provided by Iowa Testing Programs (August 2005).
To begin, what if Mrs. Thompson decided to use questions
from Form A or Form B? That is,
| Q: |
Is it ever appropriate to use the actual
test forms (those used in the current or subsequent year)
for test preparation? |
What if Mrs. Thompson used test questions from old
forms of the ITED—forms that are no longer being
used in Iowa—without modifying them? Beyond the issue
of violating copyright,
| Q: |
Is it ever appropriate to use previous
forms of the assessment (e.g., Forms K and L of the Iowa Tests)
for preparation purposes? |
Mrs. Thompson didn’t use existing ITED questions as written;
she modified them before using them. Does that make it OK?
According to the guidance from Iowa Testing Programs:
| Q: |
Is it ever appropriate to develop practice
tests locally that are similar in content or format to the actual
test forms currently in use? |
Sometimes it is difficult to say how similar a practice question
can be to a question on the ITBS or ITED
(on current or “old” forms) before it is “too
similar.” The simplest advice
that can be given regarding this issue is that if students practice
with questions that are modeled after the specific skills and
content areas depicted in ITBS/ITED test questions,
then the practice questions are too similar and the
use of this practice is inappropriate. This type of
targeted practice results in the inability to make accurate
inferences regarding student learning related to the larger
domain of content and skill areas.
| Time for reflection and/or interaction:
What was Mrs. Thompson trying to accomplish with her test-preparation activity?
How could she accomplish this purpose in a more acceptable way? |
Let’s now turn to a scenario illustrating the review of
tested content/skills and consider in what ways this practice
might result in compromising the meaning of the resulting test scores.
Teachers and administrators at Southwest
Elementary are concerned with their students’ low reading
scores and they are anxious about how well their students will
perform when taking the ITBS in November. So, they’ve
decided that for the month of October all teachers will spend
10 minutes during the first morning period working on reading
passages with students—focusing specifically on inferential
types of questions. The passages that have been collected for
this practice are from a wide variety of sources, including
some that were written by teachers, but none had been taken
from the ITBS. The questions are almost exclusively
in multiple-choice format because the teachers believe that
it’s important to give their students experience in answering
these types of questions. |
| Q: |
Is it appropriate to provide students
with a review of content covered by the test (in this case,
inferential understanding) as a form of test preparation?
According to the guidance from Iowa Testing Programs (August
2005): |
In the Southwest Elementary scenario, teachers were providing focused
instruction on inferential understanding of various sources of text.
This skill area is a critical component of reading and would likely
be included in the district’s standards as well as on most
tests designed to measure reading comprehension. However, the timing
of this additional assistance is suspect. By implementing this focused
instruction the month before the test is to be administered, it
appears as if the intent is to raise scores rather than to foster
the students’ long-term retention of this important skill.
This is the sort of instruction that would be most beneficial if
delivered throughout the year.
What about the fact that the questions were almost exclusively
in multiple-choice format? Is this OK? According to the guidance
from Iowa Testing Programs (August 2005):
The appropriateness of any proposed practice should meet either
of the two following standards:
-
It will promote the learning and retention of important knowledge
and content skills that students are expected to learn.
-
It will decrease the chance that students will score lower
on the test than they should due to inadequate test-taking skills
or limited familiarity with the item formats used on the test.
Activities that do not meet one or the other of these criteria
are more likely to be unethical, to promote only temporary learning,
or to waste instructional time. (p. 4)
If teachers at Southwest Elementary rarely use multiple-choice
questions to assess their students’ understanding of what
they’ve read, students might be unfamiliar with how best to
answer these types of questions and would likely benefit by having
some practice so that their scores are a more accurate reflection
of how well they understand what they have read. An overreliance
on multiple-choice questions on classroom assessments, however,
can restrict the type of information that teachers can obtain through
their assessment process. This type of restriction often results
in some important achievement targets, as defined by the standards
and benchmarks, not being assessed.
| Time for reflection and/or
interaction: What might teachers at Southwest
Elementary do more appropriately to build students’
inferential understanding? |
|