José A. Cárdenas, Ed.D.
The following article was written by Dr. José A. Cárdenas around the year 1964, when he was serving as chairman of the Education Department at St. Mary’s University. It was first published in 1972 and is included in his new reference book, Multicultural Education: A Generation of Advocacy published by Ginn Press.
Although written more than 30 years ago, the caveats raised about invalidity of intelligence testing for linguistically and culturally different children have never been addressed. There have been no further inquiries into the administration, performance and interpretation problems identified by the author in 1972. On the contrary, current literature about ethnic differences in mental abilities inferred from the results of IQ tests is being used for educational policy development, without regard to the problems identified in this article.
Dr. Cárdenas’ early experiences with IQ testing of language minority, limited-English-proficient and bilingual students directly contradict Richard J. Herrnstein and Charles Murray’s assertion in their recent book, The Bell Curve, that there are no cultural biases in intelligence tests.
The past few years have seen increased concern over the testing of intelligence of minority children and particularly of the assessment of mental abilities of non-English speaking or bilingual children. Various national, regional and local studies have ascertained that bilingual children are over-represented in classes for the mentally retarded, and, in some cases, the traditional underachievement characterizing minority children in the public schools has been rationalized on the basis of below normal mental abilities.
The unfair practice of administering invalid intelligence tests to bilingual and bicultural populations has been noted and addressed by the courts and various civil rights agencies. In general, both the courts and regulatory agencies have understood at least some of the reasons for the lack of test validity and have consistently ruled against the use of language-incompatible testing.
However, the remedy formulated by the courts, often at the insistence of plaintiffs, has resulted in equally discriminatory or, in some cases, even more discriminatory testing practices.
Courts have consistently ruled the use of English intelligence tests to be unfair to children of limited English speaking ability but have then ruled that intelligence testing must be conducted in the language spoken in the child’s home. Such a response has not proved to be an ideal solution to the problem, and in most cases, has resulted in worse testing practices than those being replaced.
Assumptions in Intelligence Testing
Understanding why such responses are dysfunctional requires an understanding of the rationale and methodology utilized in intelligence testing. In general, intelligence testing is based on the following four assumptions.
- Intelligence, being an intangible, cannot be measured directly; it must therefore be measured indirectly, through the way intelligence influences certain behaviors. An intelligence test item is a situation in which the behavior of the testee is dependent on his or her mental abilities.
- The test itself is a series of situations which represent ways in which intelligence is utilized. The test items are samples of activities influenced by intelligence. For example, it is assumed that a person’s vocabulary is influenced by his or her intelligence. An individual’s mental abilities determine how many and which (quantity and quality) words he or she understands.
Since it is difficult and time consuming to determine all the words an individual knows, a sampling of words is used, and the ability of an individual to understand words on this list is then generalized to estimate his or her entire vocabulary.
- It is assumed that the individual has been exposed to certain common experiences and that his or her knowledge is not dependent on his or her exposure to experiences, but rather on the amount gained or retained from these experiences.
In the vocabulary example used above, it is assumed that each of the words presented in the sample is a word the testee has commonly encountered. If the testee fails to master the word, it is because, in spite of having encountered the word, the testee is intellectually unable to conceptualize it or his or her ability to retain the concept is lacking.
- It is assumed that the testee has all the skills and competencies necessary for responding to the test situation; the only variable is the level of mental functioning. If, in a test situation, a testee is required to write an answer, it is assumed that the testee knows how to write and that the ability to manipulate a pencil does not influence his or her behavior.
Problems with Testing Assumptions
The four assumptions listed above immediately establish the invalidity of intelligence tests for persons who have atypical language, cultural and socio-economic characteristics. In fact, the invalidity is so clear that one wonders about school personnel who persist in the utilization of these tests when it is clear even to lay judges, administrators and community groups that the tests are biased, unfair and invalid.
Testing the intelligence of a Spanish-speaking youngster through a sampling of English words penalizes the testee since he or she may not have had the opportunity to hear the word in English and the test does not measure any vocabulary that he or she has acquired in Spanish.
The assumption that the testee has been exposed to experiences basic to test activities similarly leads to invalidity. For the most part, experiences utilized in intelligence test items are taken from typical White, Anglo Saxon, English-speaking, middle-class situations. Test critics go further and claim that the test items are biased in favor of Northeast, urban populations.
For example, one test item requires that a child associate a hill of snow with the type of vehicle used for transportation on this snow. A child from Key West, Florida; Brownsville, Texas; or San Diego, California, may not have the experience of sliding down a snowy hill which the test assumes everybody has, so that his or her failure in the item may be attributed to this lack of experience rather than the low level of intelligence implied by the test.
Culturally different children experience the same failure due to not having the experiences assumed by the test items rather than to lack of intelligence.
When an intelligence tester asks a Mexican American or Puerto Rican child, “What would you do if your mother sends you to the store to buy a loaf of bread and the grocer does not have any?” it is assumed that the child is acquainted with the concept of bread, among other things. If the child is better acquainted with home-produced flour tortillas or tostones and does not know how to react to the problem situation, it is dangerous to assume a low intellect.
Intelligence tests often require special skills and abilities commonly acquired at the age or grade level at which the test is administered. A fifth grade intelligence test may require third grade reading skills. The tester assumes that a fifth grader can read at least at the third grade level. However, if the fifth grade student was academically retarded because he or she did not learn to read in the first grade, having first to develop fluency in the English language, and consequently did not possess third grade reading skills at the fifth grade level, the assumption that he or she possesses the necessary skills is false, and the test item, the test and the score(s) produced are invalid.
Problems with Spanish Language Intelligence Tests
As stated previously, courts, unlike educators, have not experienced difficulty in understanding the reasons for the lack of validity of tests developed for White, Anglo Saxon, English-speaking, middle-class populations when applied to non-White, non-Anglo Saxon, non-English speaking or non-middle class populations.
However, the remedy implemented by the courts has frequently been equally dysfunctional and invalid.
In one case involving Mexican American children, the court ordered the administration of English language intelligence tests to be replaced by the administration of Spanish language intelligence tests. Most likely, the results were disastrous.
In the first place, there are no Spanish language intelligence tests developed for or standardized for Mexican American children. In the second place, language is not the only invalid characteristic of intelligence tests used for minority populations.
In order to illustrate the ramifications and complexity of the problem, I will draw from my experience in the measurement of mental abilities of Mexican American children. To avoid the complexities of analyzing tests such as the Wechsler or Binet, which require and assume much more sophisticated testee skills and experiences, I chose a very simple test of mental abilities and did extensive testing of Mexican American elementary school children using the Peabody Picture Vocabulary Inventory.
The Peabody utilizes a simple rationale and methodology. The testee is presented a test kit which has been divided into four compartments. Each compartment contains a picture depicting either a simple object at the lower levels of the test or an activity or some complex concept at higher levels.
The tester gives an oral stimulus word, and the testee is to indicate which of the four pictures depicts the stimulus word. For example, a plate may depict a butterfly, a bird, a baseball bat and an elephant. When the tester says, “Show me the butterfly,” the child is expected to point to the picture of the butterfly. Assuming that he or she has experienced the objects depicted, it is assumed that the response to the stimulus word is dependent on, and solely on, his or her mental abilities. It is assumed that the child has seen a butterfly and that he or she has previously heard and perhaps used the word butterfly.
The fallacy of the assumptions mentioned above holds true in this test situation. It is not only possible, but extremely common, that a child from a Spanish-speaking home has never heard this insect being referred to as a butterfly. Although he or she may have heard it referred to by the Spanish word mariposa, which incidentally in no way resembles butterfly phonetically, and may be able to identify the picture if the stimulus word were presented in Spanish, the child’s failure to respond correctly is taken to indicate a low level of mental ability. Incidentally, many children from Spanish-speaking environments who are highly fluent in English have never heard the test words in the English language.
The opposite of this situation is also true. Mexican American children who are fluent in Spanish frequently have never heard the Spanish equivalent of some English words, either because there is no commonly utilized Spanish language equivalent or because the concept is extraneous to the racial, ethnic or socio-economic culture of the child. For instance, I have never heard a commonly used Spanish language equivalent for the English language words marshmallow, cream puff, hot dog, or bush.
For bilingual children, the validity of Spanish-language testing depreciates tremendously. The bilingual child by definition is one who has fluency in two languages. Testing in English does not reach the vocabulary content the child may possess in Spanish; testing in Spanish does not sample the child’s English vocabulary. Similarly, English sampling does not identify words associated with a child’s Mexican (Spanish, Indian, Hispanic, Latin) culturally related concepts; Spanish sampling does not identify words associated with a child’s English (American) culturally related concepts.
Problems with Translation
Many attempts have been made to validate intelligence tests through translations. For the most part, such attempts have proved fruitless. I have seen, at some time or another, at least a dozen attempts to translate the Peabody test. The following examples illustrate the reasons for the failure of translations to validate intelligence measures.
1. Language Competency of Translators
The Spanish language competency of some translators has left a lot to be desired. In one do-it-yourself translation of the Peabody test called to my attention, the stimulus word hot dog had been translated as “un perro caliente,” which at best means a dog which is warm and at worst means a dog in heat.
2. Dialectal Differences
Translators have a difficult time identifying dialectal characteristics of the second language, which are often peculiar to the area or region in which the translated test is to be utilized. In my own administration of the Peabody, I had translated the stimulus word tree into the Spanish “árbol.” In one school, almost every Spanish-speaking student failed the test item. After the test administration, I asked a child, “What is that?” while pointing to the tree. The child replied, “Es un palo.” I subsequently found that in that area the word “árbol” was never used; “palo” was the accepted terminology.
As we have seen previously, the assumption that the testee was acquainted with the word and the concept did not hold true; therefore, the item was invalid.
3. Maintaining Levels of Difficulty
A third problem encountered in the translation of tests is retention of the level of difficulty of a test item. In the development of a test, the items must possess a certain level of difficulty to distinguish between age or grade levels. A test item using a stimulus word must use a word that is commonly known by the members of one age group (such as eight year olds) but not commonly known by the next younger group (seven year olds). If an eight year old does not know the word, it is assumed that he or she has inferior intelligence. If an eight year old knows the word, he or she is assumed to be of av