The Customer Service Survey
VocaLabs' weblog providing news and commentary on the challenges of providing good customer service.
Usability Testing the Holiday Letter
Monday - December 01, 2008 01:41 PM

This year my family decided to try something different for our annual holiday letter to friends and family: we're including a papercraft snowflake ornament which the recipient can cut out, tape together, and hang on the tree. Makes a great activity for the kids, and maybe hangs around a little longer than the usual newsy letter.
Since this is the first time we've tried this, I was a little worried that the ornament might be too hard to assemble, or that the directions would be unclear. So I did what any self-respecting dork would do: I enlisted relatives over Thanksgiving weekend to participate in a usability test of the holiday letter.
Obviously this wasn't as elaborate as a usability study for a multi-million dollar speech recognition application. I gave each participant (ten, including two of my kids) a pair of scissors, a roll of tape, a paperclip, and a copy of the ornament with the assembly instructions. Then I watched quietly to see how each person tried to put it together.
As a result of this usability test, I made two rounds of changes to the ornament design and instructions: the dotted "cut here" lines got bolder, some of the instructions were rewritten, the artwork was revised, and helpful illustrations were added.
I consider this a success. I found some things which surprised me, and the end design is much easier to figure out than the original version. This proves two things in my mind:
First, no project is too small for usability testing.
Second, I am a total geek.
How accurate are elections?
Thursday - October 16, 2008 02:10 PM

We're going into the home stretch of the 2008 election. I will be glad to have this over with: it's been almost two years since Obama and McCain started campaigning, and with the financial crisis we've clearly reached a point where we need to choose a new leader in order to roll up our sleeves and get to work fixing our problems.
Elections are a process of choosing our leaders, and the current political philosophy is that the election should be as open and large as possible in order to reflect the will of the people as accurately as possible. This wasn't always the case: when this country was founded entire demographic groups were deliberately excluded from voting (women, slaves, etc.) on the theory that not everyone is "qualified" to help choose the President, congressman, mayor, or dogcatcher.
But how accurate are elections at reflecting the "will of the people"? I don't know the answer to that question (though maybe someone working for a political polling organization does), but it's obvious that elections suffer from several problems which can cause the outcome to differ from what it would be if you could poll everyone theoretically eligible to vote:
- Self-selection: citizens choose whether or not to vote, and there's no attempt to correct for the relative opinions between those people more and less interested in voting.
- Demographic bias: many demographic factors strongly affect which candidate someone is likely to vote for (age, income, gender, race, and many others), and voting rates are dramatically different among different demographic groups.
- Order bias: on most ballots, candidates are listed in the same order on every ballot, rather than randomizing the order to correct for precedence and recency bias.
The irony is that political pollsters, in order to accurately predict the outcome of an election, have to correct their survey process in order to replicate the inherent biases of the election itself. In many ways it's easier to construct a survey which accurately reflects the opinions of the population as a whole than to figure out all the ways in which an election will differ from an across-the-board opinion survey.
Shifting Sands
Monday - September 29, 2008 03:37 PM
Was it really only a couple weeks ago that I announced the kickoff of Service Quality Tracker for financial services companies?
Since then, a third of the companies we're tracking have disappeared. Of the six, Wachovia and Washington Mutual have been merged with larger banks in order to prevent their complete collapse. We're not sure how this is going to play out from a customer service perspective, but I have no doubt it will be interesting.
For the time being, we'll continue to track all six companies independently. When Wachovia and WaMu are operationally merged with their new parents (which is to say, when the call centers and phone numbers are unified) we'll combine the data.
Calculating scores
Wednesday - September 10, 2008 02:08 PM

One of our clients is setting up a survey to calculate customer service scores for its employees, based on the answers to several different questions in the survey. These scores will be used for, among other things, calculating bonuses, so it's important to get them right.
There's a wrinkle, though. Not every question is asked on every survey, and the client (understandably) wants to exclude unanswered questions from score calculations completely--but that means there's more than one way to calculate the average score for each employee:
- Method (a): Calculate the number of points possible and the number of points earned on each individual survey, then calculate a percentage score for each individual survey, and average the percentage scores across all surveys. This method weights each customer equally in the final score, no matter how many questions the customer answered.
- Method (b): Calculate the total number of points possible on all surveys and the total number of points earned on all surveys, then calculate an overall percentage score of possible points earned. This method weights each question equally in the final score, so interactions where the customer gets a shorter survey form are weighted less.
It's important to clarify which method to use, since there's a large block of questions which will be asked on some surveys but not others.
To put this in concrete terms, let's suppose there's a sales rep, Joe, and only two survey questions, Q1 and Q2. All customers answer Q1, but only half the customers answer Q2. Joe gets 75% of possible points on Q1, but only 25% of possible points on Q2.
Using method (a), Joe's overall score is 62.5%.
Using method (b), Joe's overall score is 58.3%, because of the extra weight assigned to those customers who were asked Q2.
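The two calculations can be sketched in a few lines of Python. This is only an illustration, not the client's actual scoring code: the data is made up to match Joe's numbers, assuming every question is worth four points and that Joe's Q1 score is the same whether or not the customer was also asked Q2. The names Survey, score_per_survey, and score_pooled are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Survey:
    """One completed survey: points earned and points possible on the questions asked."""
    earned: float
    possible: float


def score_per_survey(surveys):
    """Method (a): average the per-survey percentages, so each customer counts equally."""
    percentages = [s.earned / s.possible for s in surveys]
    return sum(percentages) / len(percentages)


def score_pooled(surveys):
    """Method (b): pool all points first, so each answered question counts equally."""
    return sum(s.earned for s in surveys) / sum(s.possible for s in surveys)


# Hypothetical data for Joe, with each question worth 4 points (an assumption).
# Two customers were asked only Q1 and gave 3 of 4 points (75%).
# Two customers were asked Q1 and Q2: 3 of 4 on Q1 plus 1 of 4 on Q2 (25%).
joe = [
    Survey(earned=3, possible=4),
    Survey(earned=3, possible=4),
    Survey(earned=4, possible=8),
    Survey(earned=4, possible=8),
]

print(f"Method (a): {score_per_survey(joe):.1%}")  # 62.5%
print(f"Method (b): {score_pooled(joe):.1%}")      # 58.3%
```

The gap between the two numbers comes entirely from the longer surveys carrying twice as many possible points under method (b).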
There are good reasons to use method (b): it's a simpler calculation, and the client may want to weight more complex customer interactions (which may also be the ones with the longer survey form) more heavily. In practical terms, it might not actually change scores much (depending on patterns in the data). On the other hand, I think most people who notice this subtlety would want to use method (a), even though (b) is the more likely default choice.
Oversurveyed
Tuesday - July 22, 2008 12:13 PM

Back in May, I commented on the fact that I've heard from several companies that their survey response rates have been slowly declining over the years.
Over the past two months I've been keeping track of all the survey offers I've received. During that time, I recorded 59 survey offers: nearly one every day (I certainly missed some, too, but I made an effort to log them all). These 59 surveys comprised:
- 21 register tape survey offers, all of which required me to visit a web page to actually take the survey.
- 18 E-mail survey offers which linked to a web page with the survey itself. Two of these looked so much like spam that I almost didn't include them until I had verified they were really from the company claimed.
- 16 popup web page surveys which appeared while I was surfing the web (counting just the legitimate surveys, not the disguised advertisements).
- Only four paper surveys, all from small organizations I interact with heavily.
- 30 of the survey offers (over half) were duplicates, offers to take the same survey I had already been offered before.
- 17 of the survey offers had an incentive of a drawing for cash or merchandise worth over $1,000 (excluding incentives of dubious value like "a free copy of the report").
- 17 of the survey offers had no incentive. These 17 surveys were also much more likely to be of a reasonable length and not duplicates of other survey offers.
- 29 of the surveys I estimated would take over ten minutes to complete. To put this in perspective, response rate tends to drop meaningfully if the survey takes over five minutes.
- 14 of the surveys made it difficult to tell--even while taking the survey--how long it actually would take. One survey turned out to be about three times the length promised in the original offer.
I'm not sure what all this means yet, but a few things are clear for me as a consumer: There's no way I'm going to take a customer survey every single day, especially since half the offers were duplicates. I'm also not terribly interested in spending over ten minutes taking a survey, and especially not 30-40 minutes (which is how long one survey claimed it would take). Big incentives ($5,000 drawing!) don't do much for me, since the size of the giveaway just tells me that the odds of winning are minuscule.
There are also a few things clear for me as a survey professional: Most of these surveys do not respect the person being asked to take the survey. Most surveys were much longer than they had to be, many suffered from redundant questions, and a large fraction didn't make it clear how long the survey would take (or worse, grossly understated the actual length of the survey).
Most of my suggestions for improving survey response boil down to one thing: respect. Respect the consumer, respect the consumer's time, respect the consumer's intelligence, respect the consumer's privacy, and respect the fact that the consumer is doing you a favor by sharing her opinions. Most of the people responsible for these 59 surveys could do a lot better in this department.
Multimedia surveys
Wednesday - May 14, 2008 02:22 PM

Multimedia surveys require some care, especially in the analysis phase. Each version of the survey needs to be treated separately, since even if the questions are worded identically they will be presented differently. Different media will attract different demographics, and you can't assume that the same response on two different versions of the same survey means the same thing.
Dos and Don'ts for Multimedia Surveys
Do consider offering a survey in different media in order to understand the biases and limitations of your survey. For example, you can combine an online survey with a phone interview to get around the different demographics and response rates in the two kinds of surveys.
Don't use an identical survey in each medium. There are significant differences in the kinds and format of questions which work in each medium, and a single form designed for two media will not be optimal in either.
Do look carefully at differences in demographics and responses in each medium, and take those into account when comparing the results of the two surveys.
Don't combine data from multiple versions of the same survey without carefully adjusting for the differences between the sample and biases of each version.
Do take advantage of the strengths of each kind of survey. For example, phone interviews allow an in-depth discussion and detailed follow-up questions, while an IVR survey is inexpensive. So an IVR survey can be used to collect a very large sample for a few key questions, while the phone interview complements it with a smaller number of much more detailed surveys.
Don't assume that offering a more convenient version of the survey will increase response rate. You need to take a careful look at why customers aren't taking the survey, and if inconvenience isn't the problem then more convenience won't help.
Posted by Peter Leppik
User-Centric Design (to a fault)
Friday - March 28, 2008 01:35 PM

A couple days ago, I wrote about a research paper comparing broad vs. deep menu designs for a phone menu. One quirk of the research was the uniquely user-centered design process which led to some rather (ahem) unique decisions.
The menu structure they came up with for the "deep" design was:
Listen to Messages: Next, Previous, Repeat
Respond: Reply, Reply to All, Forward, Delete
Distribution: List Recipients, Add Sender
Message Details: Mark Unread, Time and Date
So if you wanted to go to the next message, you would first have to say, "Listen to Messages" and then choose the "Next" option from that submenu.
My immediate reaction to this design was along the lines of, "WTF? Why is 'Delete' in a submenu called 'Respond'? I would never look for the 'Delete' option under the 'Respond' menu, and that's probably one of the most used functions. What sane VUI designer would build this menu tree?"
It turns out that this menu structure wasn't built by any VUI designer (sane or otherwise), but rather through a uniquely user-centric process which illustrates a hazard of blindly applying user data to the design process. Here's what they did:
Step 1: 26 users were asked to organize the eleven functions into logical groupings of five or fewer functions. Each user's groupings were analyzed, and an aggregated grouping was generated with the following groups:
1. Delete, Forward, Reply, Reply to All
2. Repeat, Next, Previous
3. Mark Unread, Time and Date
4. List Recipients, Add Sender
So far so good, though there's no reason why some functions (especially heavily-used ones like "delete") can't be included in multiple groups.
Step 2: 101 users (not the same ones as in Step 1) were given the four groups from Step 1 and asked to suggest a label for each group. The responses were compiled, and the researchers identified the most common suggestions for each group:
1. Delete, Forward, Reply, Reply to All: "Action" (volunteered 22 times), "Respond" (volunteered 15 times)
2. Repeat, Next, Previous: "Navigate" (volunteered 22 times), "Listen to Messages" (volunteered 15 times)
3. Mark Unread, Time and Date: "Message Details" (volunteered 10 times), "Status" (volunteered 10 times), "Miscellaneous" (volunteered 5 times), "Options" (volunteered 5 times)
4. List Recipients, Add Sender: "Address Book" (volunteered 9 times), "Distribution" (volunteered 9 times)
Here we start to see the beginnings of trouble. First, while the paper's authors don't disclose the exact wording of the instructions to the survey participants, it appears that they asked participants to "label" the group; in other words, offer a short description. However, that's the opposite of what a user of the application needs to do: a user needs to take a set of labels and guess which label contains a given function, rather than take a group of functions and describe them with a label.
As descriptive terms for the groups, the labels are fine. As guideposts to the functions contained in each group, many of the labels fall short.
The other problem is that none of the labels were volunteered by more than one in four participants. This should have been a red flag that the labeling isn't obvious, has no user consensus, and needs to be treated with some care. There's no evidence that the paper's authors did anything more than accept the survey results at face value.
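The "one in four" observation is easy to check from the counts reported above. The Python sketch below assumes each count represents distinct participants out of the 101 in Step 2 (the paper's exact tallying method isn't described here), and the variable names are made up for illustration.

```python
PARTICIPANTS = 101  # users in Step 2

# Most-volunteered label for each group, with the counts listed above.
# Groups 3 and 4 each had a tie for first place; one of the tied labels is shown.
top_label_counts = {
    "Action": 22,
    "Navigate": 22,
    "Message Details": 10,
    "Address Book": 9,
}

# Share of participants who volunteered each top label
# (assumes each participant is counted at most once per label).
shares = {label: count / PARTICIPANTS for label, count in top_label_counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.1%} of participants")

print("Any label above 25%?", any(share > 0.25 for share in shares.values()))
```

Even the most popular suggestions were offered by barely a fifth of participants, and the top labels for the last two groups by about one in ten.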
Step 3: 155 users (not the same as in Step 1 or Step 2) were given the most common labels for each group, and asked to choose the most appropriate label for each group. The most popular label was used in the application:
1. Delete, Forward, Reply, Reply to All: "Respond" (66%) beat "Action" (34%)
2. Repeat, Next, Previous: "Listen to Messages" (83%) beat "Navigate" (17%)
3. Mark Unread, Time and Date: "Message Details" (40%) beat "Status" (25%), "Options" (26%) and "Miscellaneous" (10%)
4. List Recipients, Add Sender: "Distribution" (64%) beat "Address Book" (36%)
This is where "Delete" managed to get in a menu called "Respond": because it was grouped with two other functions which are variations on "Reply," most survey respondents thought "Respond" was a more "appropriate" label for the group than "Action."
Just as in Step 2, part of the problem is that the survey participants were asked to do the wrong thing: they were asked to choose the "most appropriate" label, not which label they thought a given function belonged under. The end result is perverse, even though none of the steps sound (on the surface) unreasonable.
User-Centric Design
The lesson we take from this is that while user input and surveys are critical input to the design process, they cannot replace the design process. Surveys are valuable tools, but they are not perfect, and require some intelligence and interpretation. Most importantly, they require a healthy skepticism.
In this situation, while I applaud the effort to gather lots of data about user preferences, nobody ever stopped and asked questions like "Does this result make sense?" "Did we apply the survey properly?" and "Did we ask the right questions?"
There were also some artificial constraints applied to the design, such as requiring every function to belong to exactly one submenu. A human designer would probably have elevated the most common functions to top-level status and pushed others into logical subgroups, maybe something like this:
Delete
Next
Repeat
Previous
Respond: Reply, Reply to All, Forward
Info: Add Sender, List Recipients, Mark Unread, Time and Date
(This particular grouping mildly violates the "no more than five options" limit at the top menu, but only mildly, since "Next" and "Previous" aren't offered when listening to the last or first message. A professional could probably dream up something equally functional which doesn't violate the "five options" rule.)
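One rough way to compare the two structures is to count how many spoken commands it takes to reach each function. The Python sketch below encodes both menus as dictionaries and assumes one spoken command per menu level, which is a simplification rather than anything measured in the paper. The names DEEP_MENU, ALTERNATIVE_MENU, and commands_needed are made up for illustration.

```python
# The "deep" menu from the paper: every function sits inside a submenu.
DEEP_MENU = {
    "Listen to Messages": ["Next", "Previous", "Repeat"],
    "Respond": ["Reply", "Reply to All", "Forward", "Delete"],
    "Distribution": ["List Recipients", "Add Sender"],
    "Message Details": ["Mark Unread", "Time and Date"],
}

# The alternative grouping: a top-level entry mapped to None is itself a function.
ALTERNATIVE_MENU = {
    "Delete": None,
    "Next": None,
    "Repeat": None,
    "Previous": None,
    "Respond": ["Reply", "Reply to All", "Forward"],
    "Info": ["Add Sender", "List Recipients", "Mark Unread", "Time and Date"],
}


def commands_needed(menu):
    """Map each function to the number of spoken commands needed to reach it.

    Assumes one command per menu level: 1 for a top-level function,
    2 for a function inside a submenu.
    """
    steps = {}
    for entry, children in menu.items():
        if children is None:
            steps[entry] = 1
        else:
            for function in children:
                steps[function] = 2
    return steps


deep = commands_needed(DEEP_MENU)
alternative = commands_needed(ALTERNATIVE_MENU)

for function in ("Delete", "Next", "Reply", "Time and Date"):
    print(f"{function}: deep={deep[function]}, alternative={alternative[function]}")
```

Under that assumption, the heavily used functions ("Delete," "Next," "Repeat," "Previous") drop from two commands to one, while the rest stay at two.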
It would be interesting to know how the broad vs. deep applications would have compared with a more reasonable "deep" design. Unfortunately, we may never know the answer, since that would require redoing much of the work behind this paper.
Posted by Peter Leppik
How Not To Write a Survey
Wednesday - February 27, 2008 03:17 PM
I've obfuscated the name of the company since I don't want to embarrass anyone, but I think it's valuable to point out some of the mistakes so that others might read it and learn. At least I can hope. Here are screenshots of the survey: Page 1, Page 2, and Page 3.
Among the mistakes:
- Several Yes/No questions allow the participant to select both Yes and No.
- Several implied questions, such as in questions 1 and 2. On question 1, for example, the participant is asked to skip the question if he didn't experience incompatibility issues, rather than actually answering the question "Did you experience incompatibility issues?" This implied question is confusing to the participant and makes the analysis much more difficult.
- Nonsensical choices to questions, such as in question 1: "If Yes, did you resolve the issues or did you have to submit the Offline (paper) application?" The choices are "Yes" and "No," which is not what the question is asking.
- Question 4 is a grid asking the participant to rate several items on a scale of "easy" to "difficult," but several of the items being rated are not "easy" or "difficult," for example, "Responsiveness" or "Amount of time it took to set up an account."
- Question 5 asks the participant how long the process took and provides several options, and the participant can select multiple options (such as both "4-7 days" and "over 15 days") which should be mutually exclusive.
- Question 6 asks a Yes/No question, but offers a Likert scale of choices (along with an odd set of instructions).
I should point out that the group which created this survey has absolutely nothing to do with the projects we work on for this client. Perhaps we should offer our services.
Posted by Peter Leppik
Google's new Forms feature
Thursday - February 07, 2008 02:37 PM
This is not going to replace the Global Enterprise Distributed Mega-Survey Research Platform, but for simple informal surveys it could easily give other free services (like Survey Monkey) a run for their money, especially since Google doesn't limit the number of responses or the length of the survey.
So my verdict: Cool toy, with some real use for quick-and-dirty projects. Combined with other collaborative features on Google Docs, this could be a really interesting service.
Posted by Peter Leppik
Customer Service Surveys vs. Political Polling
Monday - February 04, 2008 02:26 PM
Things which are easier for Customer Service Surveys:
1. Customer Service Surveys are measuring past experiences rather than predicting future behavior. Measuring past experiences is a much easier problem.
2. Nobody accuses us of bias if they don't like the results of a Customer Service Survey.
3. Demographic differences tend to be relatively small in a Customer Service Survey.
Things which are harder for Customer Service Surveys:
1. Time is critical, and customers should be surveyed within an hour of the end of the call.
2. To make the survey most useful, survey data needs to be matched to other data from inside the call center (such as agent ID, type of call, etc.). Political polling data stands on its own.
3. Call centers value having a long history of consistent survey data, making it difficult to add, remove, or change survey questions.
Things which are pretty much the same for all surveys:
1. Proper sampling technique and getting a statistically large number of survey responses are critical for having meaningful data.
2. Details of how the survey is administered can make a difference in the results.
3. The top-level statistics (the "headline") really only tell part of the story.
Posted by Peter Leppik
Brrrrrrrr
Wednesday - January 30, 2008 04:01 PM
I stumbled across this form, which the U.S. Antarctic Program uses to gather feedback about Extreme Cold Weather (ECW) clothing. You can rate the usefulness of gear like "Boots, white, bunny boot" or "Mittens, green, furback gauntlets" on a scale from "Very Useful" to "Useless."
I might critique some elements of the survey design--for example, with only nine choices of items to rate, they should use a drop-down menu instead of having the person type the item name; and they might want to consolidate some of the free response. But really, with a survey this specialized, they're not gathering tons of statistically meaningful data. It's really more like a suggestion box.
What's really interesting about this is:
1) Even government science agencies are looking at customer service these days,
2) They make only a minimal attempt to ensure that people completing the surveys are actually part of the target population (people deployed to Antarctica) and not, for example, shills working for clothing manufacturers, and
3) I'll go to almost any length to find some way to connect customer surveying to current events.
Posted by Peter Leppik
Proofreading
Friday - November 30, 2007 01:56 PM

Proofreading is one of those necessary chores everyone hates to do.
Ah, who am I kidding. This whole article is just an excuse to show you this screen capture from an online survey (originally from Worse Than Failure):

Posted by Peter Leppik
Boost Your Response
Tuesday - November 06, 2007 03:26 PM

I see a lot of questions about survey response rate--usually from people who are only seeing a couple percent response rate, and they're wondering (1) why, (2) if it's affecting the data, and (3) what they can do to improve.
1) It could be any of a number of things
2) It depends on the answer to (1)
3) Fix (1)
Why do People Complete Surveys?
Rather than asking why people don't complete surveys, let's first ask why they do.
Usually, someone will complete a survey if three things are true:
a) Someone asks him,
b) It's not too much of a nuisance, and
c) He thinks it matters.
Note the complete absence of any direct reward to the person completing the survey. We've found that incentives are usually a poor way to boost response because most people are more strongly motivated by social factors, and the people who take the survey strictly because of the incentive aren't as careful about providing good feedback. (Not that incentives should always be avoided, but any incentive should be more along the lines of a tangible thank-you than direct payment for the survey).
Here's a quick list of things to look at if your survey response rate is too low:
a) Did you ask the customer to take the survey? It's not enough to merely ask the hypothetical question, "Would you like to take a survey after the call?" You have to actually present the survey to the customer and ask, "Will you take this survey right now?" If you're relying on the customer to remember to take the survey, it's not going to work. Your survey is simply not a high enough priority in the life of an average consumer.
So if you're not actively reaching out to the customer with the survey--phoning, e-mailing, etc.--you're not really asking.
What's more, you have to actually get that message to the customer. The survey has to be delivered in a medium the customer is willing to use (which usually means a phone call if you're surveying about phone service, an e-mail if you're surveying about online service, etc.), and in a way that it won't get filtered out as just another marketing message.
b) Is the survey too much of a nuisance? Always respect the customer's time, and remember that the customer is doing you a favor. If the favor becomes burdensome, you'll lose out.
Customers won't want to take surveys which are too long, too intrusive, or too difficult. We've found that a good upper length for surveys is about five minutes for a phone interview, a handful of questions for an IVR survey, about a page for a written survey, and around 15 questions for a web form. Longer surveys will be refused more often. Avoid intrusive questions like age, income, race, etc., and don't make the questions too hard to understand or answer. You're looking for opinions, not deep thought.
c) Does the survey matter? Even if the survey is delivered properly and not too much of a nuisance, customers will often refuse if they think that the company doesn't really take the survey seriously.
On the flip side, one of the most powerful ways to boost response is to effectively "sell" the survey to the customer by communicating that the survey is important and the result will be carefully reviewed and acted upon by the company.
The best way to deliver this message is by a direct personal appeal from a live human, and the more authority the better. Very few people will refuse if the CEO personally comes on the phone and asks them to take a five-minute survey (on the other hand, very few CEOs are willing to pull that duty).
If the CEO is unavailable, the next best approach is to use a live interviewer who describes herself as someone working on behalf of the company to help improve customer service (for example, she can describe herself as a "quality specialist" or emphasize how the company is using the data to improve service).
The least effective option is an automated request and survey, since the very fact that the survey is automated implies that the company isn't willing to spend a lot of money on it. Even there, however, we've found that an effective recording, emphasizing that this call was specially selected and the company cares deeply about the results, can improve the response rate by a factor of two or three.
The Impact of Poor Response
All else being equal, a poor survey response leads to lower-quality data and a more expensive survey.
When the response rate is low, survey-takers are more likely to be people with strong opinions, and as a result the survey scores aren't truly representative of the customer base. There can be other biases, too, such as technical barriers to taking the survey, or even customer service representatives actively working to keep unhappy customers out of the survey.
The survey also winds up being more expensive, since you have to present the survey to more customers to get the desired number of completed surveys.
Survey response can vary from well under 1% (which I've heard about in some poorly managed automated surveys) to over 50%. At VocaLabs, we regularly achieve over 50% response in some of our live interview surveys by combining an efficient process with skilled interviewers and a well-designed script.
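To put rough numbers on the cost point, here is a minimal Python sketch of the arithmetic: the number of customers you have to ask is the number of completed surveys you want divided by the response rate. The target of 400 completed surveys is a made-up figure for illustration, and invitations_needed is a hypothetical helper name. Fractions are used so the division is exact.

```python
import math
from fractions import Fraction


def invitations_needed(completes_wanted, response_rate):
    """Return how many customers must be offered the survey.

    response_rate is a proportion between 0 and 1 (for example 0.01 for 1%).
    It is converted through its string form so that 0.01 is treated as exactly 1/100.
    """
    rate = Fraction(str(response_rate))
    if not 0 < rate <= 1:
        raise ValueError("response_rate must be greater than 0 and at most 1")
    return math.ceil(completes_wanted / rate)


TARGET = 400  # hypothetical number of completed surveys wanted

for rate in (0.01, 0.05, 0.5):
    print(f"{rate:.0%} response: ask {invitations_needed(TARGET, rate):,} customers")
# 1% response: ask 40,000 customers
# 5% response: ask 8,000 customers
# 50% response: ask 800 customers
```

Going from a 1% response rate to a 50% response rate cuts the number of customers you need to contact by a factor of fifty for the same number of completed surveys.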
Posted by Peter Leppik
"Survey Availability Bias"
Wednesday - October 17, 2007 04:03 PM

In a blog entry today, economist Arnold Kling cites four examples of what he sees as wrong but commonly held ideas: the notion that Kerry won the 2004 election but it was stolen; the belief that men have more sex partners than women; the belief that an epidemiological study is the scientific equivalent of a clinical study; and the belief that happiness doesn't increase with income beyond a certain point.
Is that really a fair critique? Let's look at the four items in more detail:
1) While there was some muttering about exit poll results from Democratic activists after the 2004 elections, it's not clear that the "Kerry won" idea has much traction at all anymore. Leaving that aside, though, the source of the meme was that leaked preliminary exit poll results seemed to show Kerry winning in Ohio. The final data, corrected for known sampling errors, didn't show Kerry winning Ohio. All this proves is that preliminary data is preliminary for a reason, and confirmation bias (the tendency to only believe evidence which supports your preexisting opinion) is alive and well.
2) Surveys of sexual behavior have consistently shown that men claim to have more sexual partners than women over their lifetimes. If the sample is unbiased and only heterosexual sex is counted, this is supposedly a mathematically impossible result (* but not really--see below), therefore it must be wrong.
What's going on here is the well-known fact that people tend to shade the truth on surveys to conform to social norms (some of us call it "lying"). As long as we're on the topic, I'll point out that survey-takers also lie about their age, income, weight, and whether they get the oil changed in their cars every 3,000 miles.
But to be an example of "Survey Availability Bias," Kling would have to show that because of the survey people actually believe that men have more lifetime partners than women. Very few people actually argue that these surveys are literally correct. On the other hand, the surveys do shed interesting light on the social norms of our culture, and that's surely worth discussing.
(* The result that men have more lifetime partners than women is not actually mathematically impossible if you consider the fact that the survey doesn't ask how many partners someone had in his or her life, but rather how many partners someone had in his or her life up to that point. If young men are more promiscuous than young women, and women are much more promiscuous than men later in life, then the low number of partners for young women will drag down the average for the whole group, even though a "deathbed sample" would give identical results for men and women. This would imply lots of hookups between younger men and older women, though I haven't done the field research to know if this is actually happening. Realistically, even though "cougars" could hypothetically explain the survey data, the real reason is almost certainly that people lie.)
3) It's certainly true that a lot of epidemiological studies get hyped in the media, and a significant fraction of them later turn out to not hold up under close scrutiny. Even though many epidemiological studies aren't "surveys" in the "go ask a bunch of people some questions" sense, this is a legitimate criticism of the way health reporting is usually done in this country. Reporters are often not skeptical enough of the results, and overlook the fact that epidemiological studies can usually show only correlation, not causation.
On the other hand, you can find lots of examples of the media overselling a story which has nothing to do with surveys or epidemiology: early claims that the iPhone would sell a million units in its first weekend (Apple only manufactured something like 350,000); periodic panics over sharks, jellyfish, or syringes on our public beaches; child abduction by strangers; and many others. All this proves is that the media hates to look too closely at a good story.
But do people really consider epidemiological studies to be the scientific equivalent of clinical studies? I suspect if you asked the question of the general population, about 85% would answer "huh?" Among the small fraction of people familiar with the terms, I suspect most would say, "no," even if they couldn't articulate the reason. "Clinical" carries a connotation of precision and control which "epidemiological" doesn't.
Among true experts, of course, the answer would be "it depends on the clinical study and the epidemiological study." The point that Kling is missing here is that there is good and bad research of both types. Clinical studies have the advantage of being able to show causation where epidemiological studies can usually only show correlation. On the other hand, clinical studies are orders of magnitude more expensive per participant, so it's possible to have a much larger and more diverse sample in an epidemiological study. This lets the researchers tease out much more subtle effects or look at a wider range of phenomena. It is too simplistic to say that one type of study is better than the other.
4) Happiness research is a new and somewhat faddish area of social sciences, and one of the more intriguing findings has been that more income does not necessarily increase happiness, all else being equal. Kling, apparently, doesn't believe this (probably because he's an economist, and traditional economics sees money as the primary motivator for decision making), and uses it as an example of something which must be wrong despite the survey data.
Kling's argument, which I reproduce in full, is: "some economists take seriously the notion that people are not happier at higher income levels, even as they point out that people have a choice of whether to earn higher income or take more leisure." In other words, people with higher incomes chose to have more income rather than sacrifice some income for more leisure time, therefore that must be the decision which makes them happier.
This is simply wrong on many levels.
First, merely having the choice between income and leisure (as opposed to not having that choice) may actually make you less happy. It is true that wealth and income bring more choices, but it's also well established that too many choices can lead to lower satisfaction and decision paralysis. So we can't assume that being able to choose income over leisure will make people happy (or vice-versa).
Second, Kling implicitly assumes that high income people are choosing the path which leads to greater happiness. I'm not sure why he thinks this is a reasonable assumption, since I've seen very little evidence that people in the real world make choices which will make them happy. On the other hand, I've seen lots of instances of people making choices which make them unhappy, whether because of mistakes, short-sightedness, inertia, misplaced loyalty, or a desire for more money.
Third, Kling also assumes that people with high income actually can choose to incrementally sacrifice some income for some leisure. Most of the high income people I know are in jobs where the choice is strictly binary: either lots of income and no leisure, or no income and lots of leisure. Either put in 80 hours a week and collect the big bucks, or get fired.
So with the happiness surveys we have a case of survey data on the one hand (which has its limits), and a purely hypothetical and rather dubious claim on the other hand. From my perspective, I'd have to say that the point goes to the survey data.
Where does this leave Kling's "Survey Availability Bias?" Of his four examples, one is confirmation bias among a vanishingly small number of true believers, another is a classic bias where nobody believes the data is literally true, a third is bad reporting combined with an apparent bias (on Kling's part) towards clinical research, and the fourth is an example of the data contradicting an axiom of traditional economics where Kling declares the data the loser.
Of course, I'm not an unbiased source, since my company sells surveys. But Kling's claim that "Surveys add noise rather than signal to our society" is nonsense. Surveys are a tool for understanding human behavior, and like any tool they can be used properly or improperly. Sometimes surveys are the best way to collect data, sometimes they're not, and often surveys--however flawed--are the only way to collect data on a particular subject. But like any other scientific instrument, a survey needs to be analyzed carefully, calibrated properly, and used in an appropriate way.
Posted by Peter Leppik
Posted at 04:03 PM | Permalink |
What's Wrong with the Rank Order Question?
Tuesday - October 16, 2007 02:55 PM

Time for a little descent into survey geekery, with a short rant about the rank order question.
Please rank these kinds of vegetables from 1 (favorite) to 4 (least favorite):
____ Carrots
____ Celery
____ Artichokes
____ Arugula
Rank order questions don't appear very often on customer service related surveys, but you do see them from time to time in market research. So why don't I like them?
1) They're more confusing than other kinds of survey questions.
2) They force the participant to make a choice even if s/he doesn't have an opinion (most rank order questions don't allow the participant to specify a "tie"), which can lead to the illusion of stronger data than what really exists.
3) They only work in writing: it is difficult to craft a rank order question which doesn't bog down hopelessly in an interview or (even worse!) an IVR survey.
4) A rank-order question actually provides less data than a series of Likert-scale questions (for example: How much do you like carrots? A lot / somewhat / a little / not at all), and the Likert-scale questions are easier to understand, quicker to complete, and translate better into different media.
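One way to see point 4: a ranking can always be derived from a set of Likert answers, but the Likert answers can't be recovered from a ranking. Here is a minimal Python sketch of that idea. The numeric coding of the scale and the participant's answers are made up for illustration, and rank_from_likert is a hypothetical helper, not part of any survey tool.

```python
# Hypothetical numeric coding of the Likert scale from the example above.
LIKERT_SCORES = {"a lot": 3, "somewhat": 2, "a little": 1, "not at all": 0}


def rank_from_likert(answers):
    """Derive a rank order (1 = favorite) from Likert answers, keeping ties."""
    scores = {item: LIKERT_SCORES[answer] for item, answer in answers.items()}
    # Distinct scores, highest first; items with the same score share a rank.
    distinct = sorted(set(scores.values()), reverse=True)
    return {item: distinct.index(score) + 1 for item, score in scores.items()}


# Made-up answers from one participant.
answers = {
    "Carrots": "a lot",
    "Celery": "a little",
    "Artichokes": "a little",
    "Arugula": "not at all",
}

ranks = rank_from_likert(answers)
for item, rank in ranks.items():
    print(rank, item)
```

In this made-up case celery and artichokes tie, which a forced rank order would hide. Going the other way doesn't work: knowing only that arugula came last doesn't tell you whether the participant likes it "a little" or "not at all."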
So next time you're tempted to write a rank order question, try writing it as a series of Likert-scale questions instead. You'll probably be happier with the results.
Posted by Peter Leppik
Posted at 02:55 PM | Permalink |
Political Polling
Tuesday - October 02, 2007 03:59 PM

Like a freight train in a fogbank, we're barreling towards another national election with only a vague idea of what lies ahead.
Political junkies know to ask several important questions when interpreting the results of a poll:
1. What's the margin of error, and is the survey accurate enough to support the conclusions being drawn?
2. Is the sampling method biased in any way which would change the results?
3. Are the questions fair and unbiased, or are the questions designed to elicit a particular response?
These are all important questions to ask about customer service surveys, too, and many existing surveys fall down on at least one of the three points. In the call center environment there are some important additional considerations, however, which don't come up in political polling:
4. Is the data timely enough to be actionable?
5. Does the survey itself leave customers with a positive or a negative impression?
6. Is the survey data matched to other call data (agent name, type of call, etc.) for analysis?
Sometimes the latter three issues will impact the first three--for example, a customer service survey might specifically focus on a group of customers who are likely to have had problems, in order to look for ways to improve. Since we're not trying to predict the outcome of a future event, sometimes it's desirable to sacrifice some accuracy in order to achieve the larger goal of improving service.
It's important, though, to treat the results of every survey with the same skepticism you'd apply to the political poll showing that your favorite candidate is going down in flames. Ask the important questions, and know what the data says (and doesn't say).
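For the first question, a rough margin of error is easy to work out yourself. The sketch below uses the standard textbook approximation for a simple random sample at roughly 95% confidence (z = 1.96), with the worst case of a 50/50 split. It assumes an unbiased random sample, which real surveys often aren't, and the sample sizes are only illustrative.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p measured on n responses.

    Assumes a simple random sample; z=1.96 corresponds to about 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Illustrative sample sizes, not taken from any real survey.
for n in (100, 400, 1000):
    print(f"{n} responses: +/- {margin_of_error(n):.1%}")
```

Note that quadrupling the number of responses only cuts the margin of error in half, which is why very precise surveys get expensive quickly.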
Posted by Peter Leppik
Posted at 03:59 PM | Permalink |
Clean Lists
Monday - July 16, 2007 02:42 PM
Apparently, this government agency wants to survey a particular type of company (which VocaLabs is not). About a year ago, they sent three people at VocaLabs e-mail invitations to participate, which we ignored since the survey doesn't apply to us.
A few months later, we got another set of e-mails, which we ignored again.
A few months later, we got a third set of e-mails, except that one of the recipients had since left VocaLabs, so the message was not only misdirected to the wrong company, but to the wrong person as well.
Then today, we got two copies of this same survey via postal mail, sent to our old address (which we vacated over six months ago), one of which was mailed to someone who left VocaLabs over two years ago.
And the survey still doesn't apply to us.
This state agency is probably wondering why they're not getting a good response to their survey, and you can't fault them for a lack of persistence (personally, I would have given up after much less than a year). I'm guessing that they bought the mailing list from a commercial list broker, but the list they got was wildly inflated with irrelevant names.
Fortunately for VocaLabs, it's usually pretty easy to figure out who to survey in a call center environment: the customers who just called you. In market research, as this state agency has discovered, it can be very difficult to get a clean list of relevant people to survey.
Posted by Peter Leppik
Posted at 02:42 PM | Permalink |
End-of-Call Survey Backlash?
Thursday - July 12, 2007 02:39 PM
I have to be cautious, since "data" is not the plural form of "anecdote," but it looks like there may be the early stages of a trend here. Many companies have now had their end-of-call surveys in place for a few years, long enough to understand the process and discover its flaws. Some early adopters of end-of-call surveys are starting to look for better alternatives.
Of course, if this really is a trend then it's tremendously helpful to VocaLabs, since our Express Feedback surveys have the timeliness of end-of-call surveys without the disadvantages.
Or maybe there's no trend at all, and this is just wishful thinking.
Posted by Peter Leppik
Posted at 02:39 PM | Permalink |
Survey Length
Wednesday - June 27, 2007 02:10 PM
The refusal rate for any given survey can be affected by the length of the survey, the perceived intrusiveness, the customer's relationship with the company, whether the customer had a very positive or negative experience, whether the customer thinks the data will be taken seriously, any compensation offered for taking the survey, and many other factors.
What's more important than the actual length of a survey is how long the customer thinks it will be. In our experience, once the customer agrees to take the survey she will rarely give up before the end (at least with a live interviewer) unless the survey is much longer than promised. People don't hang up at five minutes 30 seconds on a five minute survey (though don't push your luck too hard--promising five minutes and then doing a half-hour interview will probably bring bad results).
All these factors make it hard to tell exactly what the relationship is between the length of a survey and how many people will take it, but we do have a little bit of data. A while back, we were going through a survey design process for a live interview call back. The first version of the survey was quite lengthy, and the interviewers told customers that it would be a ten-minute survey. With that version of the survey we had a refusal rate of about 25% (the refusal rate is the percentage of people we attempted to survey who, when contacted, refused to take the survey. This is distinct from wrong numbers and people who never answered the phone).
After running that survey for a while, we convinced the client to take a fat red marker to the script and cut out a lot of redundant questions. The second version of the survey was about half the length of the first version, and interviewers told customers it would be a five-minute survey. With no other changes in the survey process, the refusal rate dropped to about 15%.
So going from ten minutes to five minutes in this particular survey cut the refusal rate from 25% to 15%. Other surveys will be different, of course, because of all the other factors involved, but this tells us that there is a significant but not overwhelming percentage of customers who will take a five minute survey but not a ten minute one.
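To make the definition of refusal rate concrete, here is a small Python sketch. The call counts are hypothetical, chosen only so the results match the 25% and 15% figures above; they are not the actual numbers from this survey.

```python
def refusal_rate(attempted, wrong_numbers, no_answer, refused):
    """Share of people actually contacted who refused to take the survey.

    Wrong numbers and people who never answered are excluded from the base.
    """
    contacted = attempted - wrong_numbers - no_answer
    return refused / contacted


# Hypothetical counts: 300 attempts, of which 200 people were actually reached.
ten_minute = refusal_rate(attempted=300, wrong_numbers=20, no_answer=80, refused=50)
five_minute = refusal_rate(attempted=300, wrong_numbers=20, no_answer=80, refused=30)

print(f"Ten-minute survey:  {ten_minute:.0%} refused")
print(f"Five-minute survey: {five_minute:.0%} refused")
```

With these made-up counts, the shorter survey turns 150 completed interviews out of 200 contacts into 170, so the gain is real but modest.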
Intuitively, I doubt you can get much lower than a 10% refusal rate without taking extreme measures (which might backfire), so in this case, making the survey shorter than five minutes probably would not have boosted the response much.
As a rule of thumb, for live phone interviews, I think that any survey that's five minutes or less is a good length and won't benefit much from being made shorter. Surveys much longer than five minutes will start to see a drop in response rates, so how long is too long depends on how much survey response you're willing to trade off for whatever additional data you're gathering.
Posted by Peter Leppik
Posted at 02:10 PM | Permalink |
Off Target
Wednesday - June 13, 2007 12:38 PM
Amazingly, the reporter for the news article actually managed to interview someone (a survey guy, even) who didn't see the problem with this.
The questions read like something you would find on a standard psychological profiling tool--the kind of thing your doctor might ask you to fill out before prescribing antidepressants. The point of the survey was probably to try to build a profile of Target customers in things like anxiety, loneliness, etc.
Nielsen/Netratings (the company hired to implement this gigantic brain fart) probably convinced Target that there was nothing particularly intrusive about a standard psychological assessment, and both companies overlooked the fact that if someone asks if you're worried that your wife might leave you, there's a world of difference between a therapist asking the question and a discount retailer.
Frankly, I'm astonished that this ever happened. One of the things anyone writing surveys should know is that the context of a survey makes a huge difference. Survey questions don't exist in a vacuum; they exist in the context of the relationship between the person (or entity) asking the questions and the person answering them.
Anyone who's ever asked the income question ("What's your annual income?") knows this. Asking a retailer's customers questions like "I am afraid of being rejected by my friends" is nothing short of sloppy.
Posted by Peter Leppik
Posted at 12:38 PM | Permalink |

