Problems with Florida's Science FCAT Test?

Over the past few weeks, I have discovered some major scientific errors in the guidelines that are used to develop questions for the fifth and eighth grade Science FCAT tests.

The Science FCAT is Florida's high-stakes test that assesses all the science concepts and information that students should have learned by the end of fifth grade. Schools and districts are subject to financial incentives or penalties, depending on their students' FCAT scores, so this is a VERY important test.

A few weeks ago, I started developing FCAT practice questions to help students review concepts and prepare for the test. To develop those questions I used FLDOE's FCAT 2.0 Science Test Item Specifications. These documents are used as:

"a resource that defines the content and format of the test and test items for item writers and reviewers."

I expected the Test Item Specifications to be a tremendous help in writing simulated FCAT questions. What I found was a collection of poorly written examples, multiple-choice questions where one or more of the wrong responses were actually scientifically correct answers, and definitions that ranged from misleading to totally wrong.

I suggest that you read over the entire document, but here are a few of the problems that I found in the FCAT 2.0 Science Test Item Specifications for grade 5.

  1. A glossary of definitions (Appendix C) is provided for test item writers to indicate the level of understanding expected of fifth grade students. Included in that list is the following definition:

    Predator—An organism that obtains nutrients from other organisms.

    By that definition, cows are predators because they obtain nutrients from plants. The plants are predators too, since they obtain nutrients from decaying remains of other organisms. I have yet to find anyone who thinks that this is a proper definition of a predator.

  2. In the same list we find:

    Germination—The process by which plants begin to grow from seed to spore or from seed to

    There are no plants that grow from seed to spore. The mistakes in these definitions are not technicalities. They are errors that any fourth grade science teacher would catch. How did they make it past scientific review?

  3. Sample Item 2 for SC.5.N.1.6 (page 32) assesses the following benchmark:

    SC.5.N.1.6: Recognize and explain the difference between personal opinion/interpretation and verified observation.

    This sample question offers the following observations, and asks which is scientifically testable.

    1. The petals of red roses are softer than the petals of yellow roses.
    2. The song of a mockingbird is prettier than the song of a cardinal.
    3. Orange blossoms give off a sweeter smell than gardenia flowers.
    4. Sunflowers with larger petals attract more bees than sunflowers with smaller petals.

    The document indicates that 4 is the correct answer, but answers 1 and 3 are also scientifically testable.

    For answer 1, the Sunshine State Standards list texture as a scientifically testable property in the third grade (SC.3.P.8.3), fourth grade (SC.4.P.8.1), and fifth grade (SC.5.P.8.1), so even the State Standards say it is a scientifically correct answer.

    For answer 3, smell is a matter of chemistry. Give a decent chemist the chemical makeup of the scent of two different flowers, and she will be able to tell you which smells sweeter without ever smelling them.

    While this question has three correct answers, any student who answered 1 or 3 would be graded as getting the question wrong. Why use scientifically correct "wrong" answers instead of using responses that were actually incorrect? Surely someone on the Content Advisory Committee knew enough science to spot this problem.

  4. For this one, you have to go to the document and scroll down to Sample Item 7 for SC.4.E.6.2 (page 42).

    There is nothing in the drawing or written information that indicates whether the square object is a streak plate (to test streak) or a glass plate (to test hardness). Scratching a glass plate is one of the most common tests for hardness, and it appears as a graphic or photograph in most textbook units on minerals. C would be just as valid an answer as B, but a student who answered C would be graded as giving a wrong answer. This flaw could have easily been avoided by simply not listing hardness as one of the choices.

These are just a few of the problems that I found. I contacted FLDOE's Test Development Center, and sent them a list of the errors. Their response for error after error was:

"This item was reviewed and deemed appropriate by our Content Advisory Committee."

I asked for contact information of someone from the Content Advisory Committee, so I could find out how these errors made it past scientific review. Steve Ash, Executive Director of the Test Development Center, told me that FLDOE would not give out that information.

Even more troubling was their response to the example questions that had more than one correct answer. In response to the example above for SC.5.N.1.6, Christopher Harvey, the Mathematics and Science Coordinator at the Test Development Center, told me:

"we need to keep in mind what level of understanding 5th graders are expected to know according to the benchmarks.  We cannot assume they would receive instruction beyond what the benchmark states.  Regarding #1 - While I don't disagree with your science, the benchmarks do not address the hardness or softness of rose petals.  We cannot assume that a student who receives instruction on hardness of minerals would make the connection to other materials.  The Content Advisory committee felt that students would know what flowers were and would view this statement as subjective.  Similarly with option 3, students are not going to know what a gas chromatograph is or how it works.  How a gas chromatograph works is far beyond a 5th grade understanding and is not covered by the benchmarks.  As you stated most Science Supervisors felt that student would not know this property was scientifically testable.  The Content Advisory Committee also felt that 5th graders would view this statement as subjective.  We cannot assume that student saw a TV show or read an article."

In response to my comments on Sample Item 7 for SC.4.E.6.2 (page 42), I was told:

"Here again I don't disagree with your science; however, elementary educators consistently told us that glass plates are not used in elementary classrooms for safety reasons.  They did not feel that 5th graders would be familiar with using glass plates to test hardness."

So according to the Test Development Center, it appears that it is acceptable to use scientifically correct answers for wrong responses on the Science FCAT as long as FLDOE does not expect a fifth grader to be educated enough to realize that the wrong answers are scientifically correct.

I wonder how many students got "wrong" answers on the FCAT because their teachers taught them too much. How many "F" schools would have higher grades if those scientifically correct "wrong" answers were counted as correct answers? How many "B" schools would get the extra funding that "A" schools get, if those scientifically correct "wrong" answers were counted as correct answers?

We may never know the answers to those questions. The Test Item Specifications are the guidelines that are used to write the test questions. If the Science FCAT test is reviewed by the same Content Advisory Committee that reviewed the Test Item Specifications, then it probably has similar errors. But as much as I would LOVE to check the accuracy of the questions from the actual Science FCAT, I can't. Teachers, scientists, and the general public are not allowed to see actual test questions, even after the tests have been graded and the penalties for those grades have been imposed.

In December of 2012, FLDOE finally corrected the errors I found in the Test Item Specifications. I still have many unanswered questions, ranging from questions about errors in the one Science FCAT that was released back in 2007 to information about whether this year's Science FCAT was written using the old version or new version of the Test Item Specifications. You can read more about the continuing struggle in this article from the Miami Herald:

Specs changes?

When I referenced the documents, I saw that the definitions mentioned are not the same, and the test questions seem to be different, too. Is there a new version of the spec page?

Yes, the Test Item Specs were finally corrected in December, 2012. There are still many unanswered questions, including whether this year's FCAT was written using the old Test Item Specs.

This is an examination of just one test. Can you imagine...

Thank you for your hard work on looking into this. My kids have the bad luck of being educated in Florida, and I too have always had a problem with the tests... I believe it is a backwoods way of gathering information. Where is the transparency in all of this, and how are they to be held accountable if no one other than their right-hand men has access to the actual material? Can you even imagine the stupidity of this all? It really angers me. But the real issue is, how do we cause change to come about? I have tried and tried to get involved in a number of different ways, to no success. The role of the gatekeeper is mighty and strong! Sad...

So, as a 5th grade science teacher at a gifted magnet school, I (and my students) am penalized because I teach my students science at a level well above a typical 5th grader's understanding? I am regularly asked to teach my students above grade level, and many do exploration and research on their own. However, using the logic in this story cited from the DOE, I am to assume that I should never teach, expect, or be excited when my students are thinking and performing at a high level; we just want average 5th graders with average knowledge.

follow the money

It's mostly Pearson that is responsible for the caliber of the materials - they have their hands in it - one way or another. If you want to know what's wrong - follow the money.

Expectations for FL test takers: no higher order thinking


I recently discovered your blog about concerns over the FL science test---KUDOS to you! I'm shocked at the Department's responses, which include (my summarizing here): 'students shouldn't learn beyond the standards', which leads into 'don't use the knowledge (or should I say value) you receive through instruction to make connections to other phenomena'. The latter is practically a description of learning of a higher order, SYNTHESIS--->one of the upper echelons of learning educators strive for.

It is bad enough to place lots of emphasis on a bad test, but to go the step further and run that data through an uncalibrated algorithm called Value Added and use it to fire teachers is DISGUSTING and SHAMEFUL.

Should the FL Dept of Ed ask again why there was no public input on the standards, simply say: WE WERE TEACHING.

Standardized exams are generally horrifically flawed, and they rest on the assumption that kids cannot apply information and principles learned in one area of study to another. We lose a lot of good instructors because they get frustrated with the system and go to work in other fields.

Here's a problem with their reasoning:
They say, quote, "We cannot assume they would receive instruction beyond what the benchmark states." and "We cannot assume that student saw a TV show or read an article."
BUT they cannot assume a student HASN'T received instruction or watched a TV show either! They must make the wrong answers actually wrong, instead of throwing in a technically right answer that they just assume a student won't know. What if a child has a scientist as a father, who teaches them about his work on the side? What if they just happen to love science and check out movies at the library on many different subjects? What if they have an older sibling they overhear discussing what they learned in their science classes? You cannot rule these possibilities out just because you "can't assume they are smart enough to know that the 'wrong' answers are actually technically correct". Faulty reasoning!

This is why U.S. schools are in trouble

Look at how many schools "teach the test." They test at the beginning of the year to get an idea of how the kids will do on the benchmarks at the end of the year. No Child Left Behind was great in theory and a nightmare for the U.S. educational system in practice. U.S. schools face challenges that many other countries do not. Teachers have to make sure everyone (even ESL students) is able to pass the benchmarks. ESL students do not have the ability to take the benchmark exams in any language other than English, but they are held to the same criteria of knowing the material and making measurable progress from year to year.

Benchmark exams are horrifically flawed, and they make the assumption that kids cannot apply information and concepts learned in one area of study to another. We lose so many good teachers because they get frustrated with the system and go to work in other fields.

I see your point. I just had this discussion the other day. The one skill that unequivocally needs hammered/ingrained in kids today is critical thinking. It's not there anymore and thank "leave no child behind" pass the test..go shopping. Fight your state against corp schools. I'd call that child abuse. (disclaimer, any typos, grammatical errors are due to alcohol move along)

this is why u.s. schools are in trouble.

I am a student attending one of Brevard's schools, and my parents and I agree that this is exactly why our public schools are in trouble. Why does Brevard think that they can teach us stuff that we would only need to know in order to pass? This is not right. We should be taught things that we could use, not only for this year, but for the rest of our lives. Not the stuff that we would only use this year. Please write back, Brevard

What to expect from children

To begin with, I received a classic British education in Jamaica, so I began 1st grade a year earlier than children in the US and began science classes earlier and advanced math earlier. (I started calculus in my 10th year of school, not counting kindergarten and having skipped grade 6). I also had four siblings ahead of me (who each had skipped one grade or another).

I am now in my mid-forties, and I do remember several conversations I had in or before 5th grade regarding what our books taught, such as how our solar system was formed: "...the gases came together and became so dense that it exploded into flame and the planets were ejected from the sun and formed their orbit around the sun...."

I did not buy that at that young age for several reasons: "If the planets were all spun out of our rotating sun, why do they not all rotate in the same way? If the explosion kicked them out, would it not have kept the heavy objects behind and ejected the light objects? If the planets' orbit around the sun was a result of this explosion, would not the size of the planets be in relation to their proximity to the sun from first to last? If that was the case, then why do we call Pluto a planet when it behaves more like an asteroid or a comet and breaks all the 'planetary rules'? What about 'Planet X', the 10th planet that some books/TV shows say is out there?"

My science teacher gave a response something to the effect of, "Those are good questions, and there are newer theories than are in this old textbook which answer some of those questions. To begin with, our sun did not make the planets in our solar system, but they were made by a similar process that made our sun. Pluto can be called a planet, comet or asteroid depending on how you define each of those, but for all intents and purposes, we will call it a planet. As for Planet X, until it is found we will not speculate about it. Some of your other questions, I do not know the answer to, but when you have become a great scientist, I hope you come back here and tell me the answers you have found."

Another puzzling one for me was, "...as the environment changed, it caused some plants and animals to evolve special ways of dealing with their new environment so that they can survive...." I did not buy into that one either.

Wouldn't they have had to evolve first so that they could survive once the change occurred? Isn't evolution a long process, such that sudden changes would cause extinction? Either the evolutionary change would have to occur first, followed by the move into the new environment for survival, or the evolution does not stop them from living in the old environment, so that when the change comes, those that evolved can survive in the new environment.

My science teacher responded with something like, "But what would trigger the evolution to begin with? The change in environment has to do this, otherwise there will be no need to evolve."

I did not buy that either; why does Adam (not his real name) have eleven fingers? What triggered that? Why do I have a dark mark on the right side of my face that keeps growing? What in the environment triggered that? Why would a sheep be born black? Why does Barry (not his real name) have sickle cell and why is Candace (not her real name) an albino? Can a change in environment become beneficial to us? If the climate suddenly became very cold, would our children be born with fur?

The next response was something to the effect of, "Well, those are valid points. Candace may better survive if she moved to an environment where it was mostly dark because her eyes are more sensitive to light. A black sheep might survive if it moved to a place where all the predators were nocturnal. [She said something about Barry's sickle cell but I cannot remember]. It does not matter to us now which occurred first. The important thing is that without the evolution, these plants and animals would not have survived."

There were so many things I read in my elementary school textbooks which did not make logical sense to me then and which we now know to be false, but which our teachers were required to teach us in school.

Regarding physics, I found my eldest sibling's physics book so intriguing that I was always ahead in "physical science" class. I was doing algebra and trigonometry long before others in my class because I was always curious about what my older siblings were doing.

I would never cheat on my math homework by looking up the answers in the back of the books because I was appalled at how many answers were wrong! Every chapter ended with about 20 math questions, and at least three given answers were wrong, more often four or five, and in at least one case, eight. In 1st form (about 8th grade equivalent) I always scored poorly on my math homework but scored well on the tests, because my 1st form math teacher never checked whether the answers in the homework book were right but always did on the tests she gave us.

It is indeed a travesty what is happening to school texts and standardised tests. I am glad my science teacher in elementary school allowed me to question, gave careful answers and encouraged me to investigate more. On the other hand, my 1st form math teacher almost made me give up on math. My 2nd form math teacher (about ninth grade) could not understand why a smart kid like me would not put any effort into math. My 3rd form (about 10th grade) math teacher made me fall in love with math again by commending me for getting the right answers even though the book had it wrong and encouraging me to explain how I got to my answers.

I think we can create a great education system. Where necessary, we need to fix the texts, we need to fix the tests and we need to fix the teachers.



I think it's interesting that the clearest logical error in Christopher Harvey's response (confusing "we do not assume X" with "we assume not X"), and the errors in the questions themselves, can both be understood as "fallacies of exhaustive hypotheses". Perhaps the test setters were themselves mistaught basic logical reasoning skills? (I mean that as a serious hypothesis, not just an insult, although of course it can be both ;-)


Ummm, so teachers have access to a similar FCAT question pattern, but they can never see the actual questions asked in the formal exam? Does anyone know why it is like that?

I'm from Finland, and we don't usually take standardized tests until we are 18 - 19, when we are about to graduate from a "High school"-equivalent education level, right before university:

For example there isn't any general nationwide ranking of elementary schools, and only about 2% of students go to private schools. Usually teachers plan out the courses and final exams, following some general guidelines about the topic coverage.

These final examinations are publicly available, as are the answer keys. This one is this year's "advanced mathematics" exam (23rd March), but too bad I wasn't able to find it in English:

Well, since it is math, you can maybe guess what they are asking for :) You choose to answer 10 questions out of 15, and the maximum is 6 points per question, so a total of 60. The last two are broader, and thus if you have extra knowledge, you can score up to 9 points each.

This one is for physics, with so much Finnish that you probably won't understand much: you answer 8 questions out of 13, and the last two have a maximum of 9 points.

That's crazy...

Only average (average defined by Mr. Harvey and his ilk anyway) schools will ever get an A rating from this. If a school exceeds the standards, it will never be able to achieve an A rating. The further a school goes above the average, the worse their rating will get.

Given the test is supposed to bring about higher educational standards, this is absurd.

Sadly this isn't a new problem

My father was a botanist and a passionate college teacher. When I was in junior high and high school, back in the early-to-mid '80s, he and I discovered that my "earth science" and biology textbooks were riddled with errors (and not just in the chapters about evolution, which were very short and included a craven allowance for intelligent design as a legitimate explanation of how things came to be).

I specifically remember one howler of being told by the earth science textbook in Jr. high that the Great Lakes and the Finger Lakes were formed when glaciers left behind piles of till that created dams (when in actuality they were created when the glaciers scoured deep holes in the ground where the lakes now are).

And when, in high school, I prepped for the standardized statewide final exam ("Regents Exams") by studying previous years' tests, we discovered that the tests for Regents level Biology often contained at least 1 or 2 errors per test. There was something about xylem and phloem in plants on the test that got the two types of tissue seriously backwards, IIRC.

Fortunately, my dad took it on himself to explain to me the real science. It got to the point with the high school biology textbook that I would basically read the book and then talk to him about what it said in order to learn whether what it said was true. So thanks to my dad but no thanks to the statewide standardized tests and texts, I emerged from school having gotten a much better science education than most of my peers.

I emerged from high school with respect for my teachers, but no respect whatsoever for the professional "educators" who crafted the state's curriculum.

state testing

I am sure everyone has heard, "Those who can, do; those who can't, teach."
Have you heard that those who can't teach become administrators, those who can't administrate go to the central office, and those who can't cut it there go to the state DOE?

It is not just at the K-12 level that Florida has problems

From: "Concerned Faculty"
April 16, 2012

Many of you have already heard about the Computer and Information
Science and Engineering (CISE) department in the College of
Engineering at UF being singled out for cuts and massive
restructuring. The Engineering Dean Cammy Abernathy began
implementing her plan within hours of unveiling it, by giving
provisional notice to at least 7 staff in the department.


CISE is to take a 20% cut, which represents at most a 2% cut across the
board in the College of Engineering. No other engineering department
will be cut.


The department currently has 32 tenure track faculty, approximately
610 undergraduate majors, 400 masters students and 130 PhD students.
It maintains an international research reputation as befits its status
as a research department in the flagship research university of Florida.
It has 2 ACM, 4 IEEE, and 2 AAAS fellows, and 22% of all
faculty in the College of Engineering who have received the
prestigious NSF Career Award belong to the CISE department. The CISE
faculty engages in substantial interdisciplinary research, the
majority of which takes place in collaboration with the College of
Medicine and the College of Liberal Arts and Sciences within and
outside UF. The CISE department accounts for less than 10% of the
College cost, and has the highest revenue/cost ratio of all the
departments in the College.


The sitting ducks:
Approximately 50% of the faculty, most of whom are actively funded
researchers, fellows of professional societies and recipients of
professional honors and awards - and represent the most outspoken
faculty - are slated to remain in the degraded CISE department. They will be
expected to handle over 75% of the current teaching obligation of
the department, without any office or computer systems support staff,
or any of the current 80 TAs. This obligation simply cannot be met
- the programs will lose their professional accreditation.
This degraded department, containing the faculty members who are
the most outspoken, will have to be shut down - A DIRECT ATTACK ON TENURE

The disoriented ducks:
The remaining 50% of the faculty, picked by the Dean, will be given a
choice to move to 3 incoherent other departments - Industrial Systems
Engineering, Biomedical Engineering, Electrical and Computer
Engineering, units with different standards and quality measures,
values and culture that they did not agree to when they joined the

The victims:
Over 75% of the CISE (software) students will be moved to ECE (a
hardware-oriented department), including all graduate students.
(Software-oriented occupations expect a 30% growth in the next 10
years, 3 times average. Hardware-oriented occupations expect a 9%
growth, lower than average). The best and brightest of these students
will be dissuaded from entering academic occupations - their view of
the standards and values of academia have already plummeted, having
witnessed this debacle -- AN ATTACK ON THE IMAGE OF TENURE AND


If this formula for dismantling tenure and academic freedom turns out
to be successful, your department could be next.

The Math and Science Coordinator's ignorant response

While I don't expect my students to know what a gas chromatograph is, I do encourage them to seek out advanced knowledge and readily answer any questions about science, even if they're not grade-level benchmark appropriate. Incidentally, we HAVE discussed the full spectrum of light and that scientists can identify a substance by the gases given off when it is burned or the specific color/range of spectrum of light given off when combustion of an unknown substance occurs.

These discussions were in part due to grade-level content that mentioned scientists identify substances through gas chromatography BECAUSE the de facto standard Reading anthology for Title I schools (Imagine It!) tells them that the landers and rovers on Mars burn soil to analyze the light given off and look for organic components. So, in fact, the Mathematics and Science Coordinator was incorrect when he stated they "cannot assume that student saw a TV show or read an article" explaining the idea, as a large portion of the State's students were FORCED to read one and might have a deeper understanding of how a scientist or chemist could identify composition of a substance than they realized even though it is "far beyond a 5th grade understanding." I've included the excerpt below.

The Mystery of Mars
By Sally Ride and Tam O’Shaughnessy
(p. 392 of Imagine It! reading anthology)

"But are the building blocks of life present in the Martian soil? Another experiment looked for organic molecules, the molecules that make up living things. Samples of soil were heated, and instruments watched for gasses that would be released if organic molecules were present. It was a great surprise when none was found. Scientists know that meteorites and interplanetary dust deliver a steady supply of organic molecules to the Martian surface. So even if there are no living organisms, there should still be some organic molecules. Scientists now suspect that they are being destroyed by harsh chemicals in the Martian soil."

For the record, I just reinforced that particular idea last week when a student was walking down the hallway and asked "Can a flame be green?" because they were having a debate. I reminded them that every substance that burns gives off a different type of light and that some give off light that we see as different colors of green. When we got back to the classroom, I handed them a Scott Foresman Science book (another common text in many schools) and had them look back at the chapter which talked about how scientists identify substances. The text reads:

“Scientists sometimes use flame tests to identify a substance. In a flame test, a material is heated to high temperatures in a flame. Different substances will cause the flame to have different colors. When these flames are studied closely with laboratory equipment, the substances can be identified. What color flame does calcium chloride give off?”
The page is complemented with a picture of strontium chloride (red), barium chloride (yellow), calcium chloride (orange), and potassium chloride (violet) being ignited - “These wires were dipped in different metal salts. When the metal is heated to a high temperature, the color of the flame can be used to identify the salt.” (Scott Foresman Science p. 385)

Assuming students have limited knowledge and imagination would be a mistake, especially when you know that you encourage publishers to include higher level science ideas through cross-curriculum integration. Pretty early into the year I came to a sample question very similar to his 3rd point concerning scientifically testable statements... I ended up explaining to the students that "I disagree with this question because more than one of these is testable" and giving the ways to test each of the statements. Then, I asked them which one they could "easily" test themselves without expensive or very specialized equipment... THAT was the correct answer. It's sad, but we want them to think scientifically, but answer as a kid would answer, not a scientist. Your highest students might miss those questions though because they can get a notion of or conceive of a way it could be tested even if they don't have 100% of the scientific knowledge to implement the test themselves (or put a name like "gas chromatography" to it).

Robert, I included the references to common texts specifically for you to have more ammunition when addressing this with the State. I realize that many schools are departmentalized and many Science teachers have no knowledge of the common reading curriculum. I do wonder, though, why a Science Coordinator would not be familiar with the common science texts like Scott Foresman. While most schools have been pushing to ignore the text since it is not the curriculum, focusing solely on benchmarks and using the text as a reference, it would be wise to have a good working knowledge of the content that is readily available to most students and not assume that they are limited in their understanding of more advanced science principles.

A lame if charming question:

Okay, this question on page 72 is adorable and I learned about pandas, but how does it show whether children have achieved the benchmark, "Compare and contrast adaptations displayed by animals and plants that enable them to survive in different environments?" Lookit:

Giant pandas live in the mountain forests of China and eat mostly bamboo. The giant panda has a sixth “finger,” while other bears have only five. The sixth finger is a large wrist bone that giant pandas are able to bend and use as a thumb. [who knew!] The picture below shows the paw of a giant panda with six fingers and the paw of another bear with five fingers.
Which of the following statements best explains why the sixth finger helps the giant panda survive in its environment?
A. It helps the giant panda hold the bamboo stalks it feeds on.
B. It helps the giant panda crush the bamboo stalks before it eats them.
C. It allows the giant panda to dig in the mountain forests to hide its food.
D. It allows the giant panda to climb to the tops of mountain forests to find food.

This is a reading comprehension question, not a science question. They just TOLD me a panda has a thumb. I don't have to know anything about adaptations or understand really anything beyond how my own HAND functions; I just have to know what a thumb is and how it works so that I can determine which of the four described actions a thumb would most come in handy for. You do not have to have gone to school at all to answer this question, except inasmuch as you need to have been taught that people who write FCAT questions are literal minded dullards and that therefore it's answer A, not answer D, even though D could be correct if you read the illiterate "climb to the tops of mountain forests" as meaning "climb to the tops of mountain forest bamboo stands" and apply your knowledge of giant pandas and imagine a panda doing what you've seen them do in zoos and on television and then remember from your own experiences as a higher primate how useful a thumb is in climbing. It takes zero scientific chops but some pretty specialized and sophisticated intuitive sense to figure out that while all of that about D is certainly correct, D must not be chosen because the question is written by morons who are practically blowing an airhorn at you with their obvious, "pick me!" wording ("HOLD the bamboo stalks") in A.


I'm pretty excellent at reading comprehension. I am fairly lousy at science. I am actually really excellent at reading comprehension and terrible at science. But:

A is arguably the right answer. There is nothing in the reading portion that tells me it's wrong. A thumb-like appendage could certainly accomplish this.

B is arguably the right answer. There is nothing in the reading portion that tells me it's wrong. Again, it seems to me clearly easier to crush bamboo with a thumb than to do so without a thumb.

C is arguably the right answer. There is nothing in the reading portion that tells me it's wrong. Does a thumb make it easier to dig? Sure it does. You can, I don't know, more easily throw the dirt you've dug.

D is arguably the right answer. There is nothing in the reading portion that tells me it's wrong. The panda lives in mountain forests. Maybe the food is better the higher up they go--there is no evidence to the contrary in the writing provided. Is it easier to climb if you have a thumb? Absolutely.

There is no wrong answer--only 4 right ones. Please tell me why I'm wrong.

You are correct. One of my

You are correct. One of my questions to the Florida Department of Education was asking if pandas were part of the curriculum. Unless you have studied pandas, there is no way to determine which is the correct answer.

Science moms unite!

I am sitting here with my mouth open, because two weeks ago I was reading that "soft petals" question on the DOE website and ranting to my friend about how the question was using terrible science to test 5th graders. She and I both volunteer in public schools doing hands-on science and both of us have science degrees. These tests are bad "gotcha" tools that do nothing to test real science knowledge, and will turn kids who are excited about science into frustrated, science-hating robots. The state claims to care about STEM subjects, but this is an example of how politicians, test companies, and others who are hundreds of miles from actual classrooms are grading our children. It makes me so sad for curious, potential young scientists. The most curious students who do extra reading will be punished by overthinking the test. They will give up on science when the test "proves" they are bad at it. Parents and scientists and teachers need to do a PR roadshow with slideshow to talk about this travesty. We need to take science back!

Science Volunteer/Mom thankful for validation

I am sitting here with my mouth open, because two weeks ago I was reading that "soft petals" question on the DOE website and ranting to my friend about how the question was using terrible science to test 5th graders. She and I both volunteer in public schools doing hands-on science and both of us have science degrees. These tests are bad "gotcha" tools that do nothing to test real science knowledge, and will turn kids who are excited about science into frustrated, science-hating robots. The state claims to care about STEM subjects, but this is an example of how politicians, test companies, and others who are hundreds of miles from actual classrooms are grading our children. It makes me so sad for curious, potential young scientists. I worry that curious minds who do extra science reading will be the ones punished for their audacity to learn more science, will test poorly by overthinking the test, and will give up on science very early when testing "proves" they aren't good at it. I would really like to get a panel of parents and scientists out on a panel tour, with slideshow, to make the case against this insanity!

It's much worse than you think...

I interviewed with a major textbook publisher for an editorial job (with my MS in Curriculum and Instruction on top of my undergrad in engineering and Physics) and it was one of those wonderful three-ways, where I saw three managers back to back to back. All were women; two had English degrees, the third had a J-school degree. I had to assure them that despite my hard science background, I was fully skilled in desktop publishing and graphic arts.

The job in question was of course as a mathematics textbook editor.

They told me I wasn't what they were looking for, since I did not have either an English or a J-school degree, and they saw no need for a subject matter expert in the field, since their materials were all contracted out and came pre-vetted. The job itself consisted of sticking pretty pictures and graphs alongside the content, and proofing it, since of course scientists can't read or write well.

I pointed out that one main reason I was seeking the job was my experience with their own textbooks being wrong, and having to explain to my students in the classroom that they were not to trust the materials handed to them. I also recalled (this was after I realized I wasn't going to get the job) the time I saw a student teacher break down in tears because she could not for the life of her arrive at the incorrect answer given to a problem in the answer key.

I no longer teach, nor will I, except for $80-$100 an hour as a tutor to kids whose parents value my talents and time.

Then there was my experience writing ninth grade exit exam questions for Physical Science for the state. The rubric asked for taxonomically leveled questions but the curriculum was itself taxonomically leveled, which made the instructions ludicrous. I ignored the instructions and wrote the questions the way they needed to be written. After they adopted every question I wrote (about 40% of the exam), I went to see the state curriculum director to ask what clown wrote the rubric for writing the test. It was he, of course, and he admitted sheepishly that not only was I right, but that nobody in the entire test development process, including himself, had noticed how flamingly stupid it was to expect taxonomically leveled questions on an instructional objective designed taxonomically in the first place. That's why they accepted all my questions and asked for more.

They also asked me why I was wasting my time teaching high school...which was why I went back for my MS in C&I...and the wheels on the bus go round and round.

Then, of course, there was the career working for The Learning Company in educational software...but that too ended badly. No subject matter experts in that domain either.

"All three were women"

Your story wonderfully illustrates the many flaws with the American system of using standardized tests to evaluate students. However, I have one problem with it. In your first paragraph you talk about interviewing with three managers back-to-back-to-back. You give a description of them as having degrees in English and a J-school degree, but just before that description, you erroneously mentioned they were all women. Why did you mention that? Their gender doesn't seem to have any bearing on the rest of that story. In fact, you only ever mention one other person's gender (a man) and it's by passive use of a pronoun, not by actively mentioning it.

I don't like to attribute an act to malice when it's unwarranted, so I'm willing to think that maybe the statement "All three were women" is just left over from an earlier draft, or it serves some purpose that I'm not seeing.

Debate Class seems more important now

While a personal bias (perceived or real) might have nothing to do with the real argument being presented, it can make some people less willing to listen and can give an opponent a chance to use an "ad hominem" attack. (Good Wikipedia article on this and other logical fallacies.)

To keep the focus on the problem with science class, try to keep such remarks out.

All three were women...

What difference does the reference to women make when the content was right on target with the objective of the article?

We spend so much time being politically correct, we remove the importance of being scientifically correct.

How disappointing...

Yes, I am an educator; and yes, I am a woman...and I am not offended at all.

Kudos to the author of this article, "Problems with Florida's Science FCAT Test!"

Maybe gender bias is subjective

Although the fact may not have been directly pertinent to the story, your preference that "All three were women" be omitted from the story says more about your bias than the author's.

The author didn't comment on this fact one way or the other, leaving one to infer on their own that "most of the managers are women" or "women will defend unreasonable positions" or maybe even "why aren't there more male managers in the female-dominated educational system"... but certainly none of these positions were stated.

I'm a woman and I get it.

I'm a woman and I get what is being said here. Dismount from your high horse and get rid of the pc attitude that gets us into trouble every time.

i normed this test

I was one of the students who normed the FCAT in its first (well ... zeroeth?) year. It sounds special, but it's not. All public school students in Florida in 1998 or so, who were my age, normed the test.

I remember seeing a lot of questions like the "testability" one, where there were clearly several correct answers, but -- in my mind -- one that they were definitely after. I lucked out and fell into that Venn diagram intersection of knowing that multiple answers were right, yet being savvy enough to pick out the one they wanted.

In subsequent years I remember consistently falling into the 99th percentile. Again, with a test so poorly written, testing well doesn't have anything to do with being smart, knowing information, or knowing how to reason. It has to do with getting into the brain of a Florida state bureaucrat.

I seem to remember that the year that I normed the FCAT I did so in my school's gifted class (where the teacher opened our paleontology unit by explaining that evolution was Satan's lie). Everybody in the room thought the test was pretty stupid, even the teacher. Yet it comes with a lot of very heavy preamble about what an important test it is, in the weeks leading up to it. There are a lot of practice tests (not for the norming, but I did take it in subsequent years) and vague admonitions about Our Future. FCAT becomes an obsession for teachers and students both; tons of valuable class time goes toward it. I had no idea that it was being used as a tool for determining how school funding should be meted out, but if I had known I think I would have been furious.

There's also a writing section of the FCAT (it was called "Florida Writes," apparently, when I took it) which is maybe even worse than the rest of the test. When it's not multiple-choice, suddenly getting inside a bureaucrat's head is much trickier.

Comparatively speaking I blew Florida Writes. Why? Because, at least in my year, they had a creative writing prompt we could select. And naturally I didn't see it as the trap it was; I took it and enjoyed it, and got the equivalent of a C on the test. Florida's government does not reward creativity.

You could've gamed the CRW part, too.

If they still have that "Florida Writes" bunk, and I think they do, then language arts teachers need to spend the months of FCAT prep time reading on the lower end of the Oprah-recommended continuum--_The Secret_ and the _Chicken Soup_ series and their fictional equivalents plus a few hallmark card whimsyverses just in case some hapless child is inspired to write a poim. They should teach FCAT test prep like they teach foreign languages. It should be like immersion classes--five hours a day meditating and trying to channel the test writers' sluggish, _Who Moved My Cheese_ thought processes so that the children will learn to "think in FCAT" would do them a lot more good than actually trying to teach them science. Or maybe separate the class time exactly in half: this 20 minutes will be devoted to science and this 20 minutes to FCAT "si-yunce." Because the two things are very different, you see.

You don't need to know how to measure something

When I was in fifth grade, I'm pretty sure I didn't know about gas chromatography, but I did know that smells were caused by chemicals in the air. You don't need to know a lot to know that smells are caused by objective properties of chemicals and that there's probably some way to measure them. I knew about colors long before I knew about spectrometers, too.

Problems with a gas chromatograph

You can't actually measure "sweetness" using gas chromatography because the molecules that trigger the set of sweetness receptors are various and hard to predict. It's a similar problem with "sweetness" as a taste. You can be given the chemical structure of a molecule and not be sure if it will taste sweet because sweetness is actually the triggering of the sweetness receptor which responds based on a chemical fit. The chemical fit is impracticable to model.

With smells you have the same problems as you do with tastes, with an additional issue. There's no normalization system for the sense of smell! The receptors go directly into the brain and the brain biases these signals based on experience. This means two people with exactly the same distribution of smell neuroreceptors (which is actually quite rare) will experience smells very differently depending on the smells they have recently been exposed to!

As other posters have mentioned, you can still do statistical surveys to get a handle on objective phenomena with subjective components, a technique used in genuine, hard sciences for things like pain research. You can also measure relaxation, stress, happiness, excitement, and the unpleasantness of odors (why does propane smell so bad? Because of science!). In engineering you have quality assurance's "OGF" (overall good feeling) metric, technical debt metrics, and lots of other things as well. There's also the effectiveness of sirens, emergency vehicle colors, etc. Reselling uses A/B testing for product placement, and so on.

In practice you'd use real human noses, human population samples, and some statistics to answer questions of sweetness, as is actually done in industry.

While I agree that a normal

While I agree that a normal 5th grader should be able to infer that odors involve chemicals whose identity and quantity can be known empirically, the quality of "sweetness" (and decisions about relatively greater or lesser sweetness) is more likely to be inferred as subjective. Indeed this apparent "problem" of multiple correct answers shouldn't pose any serious dilemma to normal 5th graders, if they base their answers on figuring out which choice is "most likely" or "most extremely" consistent with the sense of the question. These questions could have been done better, certainly, but they don't pose such severe or jarring contradictions.

sweetness can be measured

There is a complete science (sensory science) which answers questions like: which sample is more sweet. Human panels are used rather than analytical chemical data. The answers from these panels are considered objective.

Indeed. Therefore item 2 is also correct.

Indeed. You can also use this technique to answer the question of whether "The song of a mockingbird is prettier than the song of a cardinal." So all 4 of the answers are correct.

All 4 are correct, and nearly identical

You can use a panel of people to determine the softness of a rose petal (although there are actual scientific tests that are more accurate)

You can use a panel of people to determine the sweetness of a flower scent (although it is also an objectively testable property in its own right)

You can use a panel of people to determine the prettiness of a bird's song.

You can use a panel of bees to determine the differential attractiveness of a flower along a petal-length axis.

So of the 4 answers, the "right" one is arguably less correct than at least two others.

The wording of question 2

The wording of question 2 makes it different from the others. Softness, sweetness, and the number of bees are all properties of the objects being studied. On the other hand, pretty is in the eye of the beholder.

A test of the relative hardness of the rose petals would tell anyone who tested it if one was softer. To put it into a different perspective, replace the two colors of rose petals with the minerals calcite and fluorite. No matter who tests them, the calcite will always be softer.

With smells, the observation is on sweetness, not how pleasant the smell is. Smell is based on chemistry. Methyl anthranilate gives gardenias part of their sweet smell, but it also gives the "sickly sweet" smell to dead animals. Both are sweet, but only one is pleasant. A chemical analysis of the scent of the flowers would tell any chemist which would smell sweeter, but not which is the most pleasing.

For answer #2, the prettiness of a bird's song will vary from person to person. Try replacing the two bird songs with music by Mozart and Madonna. Which is prettier will depend on who you ask. You would be testing the observer's opinion of the sound, not a property of the sound itself.


Indeed, rkrampf is right. Indeed.

Not by accident

There was a SF story (published originally in Playboy, I think) that dealt with this very issue. The child in school that noticed the problem with questions like those detailed above, was noted, assessed, and then if deemed genuinely insightful, put down (as in killed).

At 12, the English teacher in class read the story aloud, and then asked if anyone had previously read it. I was the only one who said "yes", but the teacher changed his mind about pursuing that line of inquiry. Of course, my exposure was through borrowing a book collecting the (then) recent SF stories from Playboy.

Some comment above noted that the 'smart' student is supposed to 'game' the system, and choose the answer they know is intended as the correct one. However, why should 'smart' at science be the same as 'smart' at social engineering? Clearly the two skills do not have to correspond. Indeed, younger 'smart' students are more likely to be naive, and believe the world to be as it is described by authority figures, i.e., honest and guileless.

There is a bigger picture. Research a little, and you'll discover that 'new maths' introduced into schools within many US states has significantly lowered the maths skills of the majority of even 'smartish' students. This has been no accident. Kids who really excel at maths can accept inefficient 'playful' methods of introducing fundamental concepts, but these same methods will confuse most children with even above average ability. Thus you dumb down a population with methods you can defend, because the brightest remain unaffected.

All populations are managed, never more so than in aggressive war-making societies like the USA. Rome didn't need science thinkers from its own people; it had servile populations like the Greeks for that. America isn't this bad (yet), but America is designed to extract the wealth, resources, and skills from client states, just as Rome did.

RE: questions seem obvious frauds

I've been teaching in Florida, in FCAT-tested grades, for the past seven years. Over this time I have noticed not just many anomalies with the testing system, but with the entire educational system. Last week, our teaching staff was called to a meeting so our principal could bring to our attention the high percentage of students that are dropping out of school in Florida, primarily young black men, who were identified as having the highest dropout rate. We were shown a video presenting dropout causes, statistics and likely outcomes; and discussed ways we could assist with resolving this problem at the elementary level. After reading your post, more specifically the part about "not coming up with questions like that by accident", one particular part of the video immediately came to me. As many people probably already guessed, a life of crime was the outcome for dropouts emphasized in the video. What most people probably wouldn't have guessed is that when the state (or states, not sure if this video was specific only to FL) determines projections for the number of prisons that will need to be built in the future they use the test scores of third and fourth grade students. In my head I was thinking, "so, with the raising of the standards, and then the raising of the developmental scoring levels that determine scores on the FCAT, and the apparently invalid nature of the test, and...I could go on for hours listing many other contributing factors that could (DO) affect student scores...FL is going to have some very high prison projections over the next couple of years. Unless of course, schools begin/continue cheating, as one FL charter school was just accused of doing this week." I keep wondering when/if people are going to wake up and realize what's going on; because you are right, the negative impacts of this test are too many and too much.

not exactly...

failing the test doesn't set someone up for a life of crime. It's the fact that students are behind... that they aren't even meeting the low standards many states set... that leaves little options for them. I work in a high-needs school in Illinois, and we know how critical the early years are because they lay the foundation for the later years. And the statistic is *true*-that prisons are built based on third grade test scores--but it's because the schools are failing kids. Cheating on the test and making sure less kids fail doesn't mean less kids will go to jail... it might just reduce that correlation between test scores/prison population.


The word you are looking for is fewer.

Maybe all of life should be multiple choice, with many correct answers but only one acceptable one.

And we'll let you know later whether your choice was accepted. Just to mess with you, we'll wait until it is too late to do anything useful with the test results this year and disregard the test results when we give you next year's life guidelines.

Good luck with that.



If you think of kids as we do now, as a useful substance, like gasoline, as in: "We're gonna need more kids to pour into the correctional system to keep it running," then "less" is perfectly correct. Get with the times, Nance, and remember that language evolves with us. "Kids" no longer describes a bunch of individual human creatures. Kids now describes a featureless mass to use to stock prisons and military bases.

Negative assumptions are as bad as positive assumptions

As fast as my head was spinning as I read this post, what really got me going were the following quotes:

* "We cannot assume they would receive instruction beyond what the benchmark states."
* "We cannot assume that a student who receives instruction on hardness of minerals would make the connection to other materials."
* "We cannot assume that student saw a TV show or read an article."

On their face, those all make sense. Sure, we can't assume that a student would have seen a TV show that explained a gas chromatograph. But what about the students who _have_ seen such a TV show (or have a parent who works with one!)?

They've taken reasonable assumptions that the student may not have been exposed to certain concepts and turned them on their head and assumed that children have NOT been exposed to certain subjects.

A student who excels, who goes above and beyond, or who just has really geeky parents is PENALIZED by this test.

This insanity is worse than I thought.

(I am a parent in Florida of a 4th grade public school student, though we're fortunate enough to have her in a magnet program that doesn't put _quite_ as much emphasis on the test.)

No, they do not make sense!

No, they do not make sense! This is a basic logic fail... a => b is not the same as !a => !b. In other words, while not testing for something that was not taught is legitimate, testing for the lack of knowledge of (true) things that have not been taught is not, and the justification provided for doing so is totally unrelated to the problems presented.
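
For anyone who wants to check the logic point mechanically, here is a minimal sketch in Python. The helper `implies` and the readings of `a` and `b` are assumptions introduced only for illustration: take `a` as "the concept is in the benchmark" and `b` as "the student knows it". The sketch simply enumerates the truth table and lists where a => b and !a => !b disagree.

```python
from itertools import product

def implies(p, q):
    """Material implication: p => q is false only when p is true and q is false."""
    return (not p) or q

# Illustrative reading (an assumption, not from the FLDOE documents):
#   a = "the concept is in the benchmark", b = "the student knows it"
rows = []
for a, b in product([True, False], repeat=2):
    rows.append((a, b, implies(a, b), implies(not a, not b)))

# Collect the cases where the statement and its inverse give different answers
disagreements = [(a, b) for a, b, fwd, inv in rows if fwd != inv]

for a, b, fwd, inv in rows:
    print(f"a={a!s:5} b={b!s:5}  a=>b: {fwd!s:5}  !a=>!b: {inv!s:5}")
print("disagree when (a, b) =", disagreements)
```

Under that reading, the case a = False, b = True is the student who knows something that is not in the benchmark: a => b still holds there, but !a => !b does not.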

The problem is clearly widespread

I have noticed that the ETS Praxis biology practice test was problematic as well. Have you had the opportunity to check these out? They are tests for teacher certification.

Years ago I did a degree in New Zealand for certification and did an analysis of the NCEA first and second level biology tests. They had a lot of mistakes and much of the information was never actually correct. I sent the Ministry of Education a small study with citations from legitimate sources. Their response was not unlike what the State of Florida gave you.

Thomas Simmons
Kansas City, Missouri

