Thursday, September 27, 2012

Academic Requesters: How to prevent retakes on your surveys without risking worker accounts

EDIT: Current information is on THIS POST. The information below is dated and might not work.



Preventing retakes on your survey is one of the most common problems requesters have with mTurk. This guide explains an easy way to prevent retakes using only the Amazon mTurk Command Line Tools (CLT), which work on Windows, OS X, and GNU/Linux.

Note about Amazon's Block Method

A Google search will yield this forum post, where an mTurk representative tells requesters to use the block worker function to prevent survey retakes. Unfortunately, Amazon's system is not perfect: even if you do this, a worker may still get the following form email:
Greetings from Mechanical Turk.
We regret to inform you that you were blocked from working on HITs by the following requester(s):
Example
Requesters typically block Workers who submit poor quality work.
Requesters rely on Mechanical Turk for high quality work results. In order to maintain quality, we continuously monitor service activity. Should additional Requesters block you in the future, we may suspend your account. Please ensure your work quality is at a high standard. We encourage you to read the HIT instructions and follow them carefully.
We realize that this block may be a onetime occurrence for you. Should you maintain high work quality with no further complaints for the next few months we will dismiss this event.
Regards,
The Mechanical Turk Team
Not only that, but if you want to give out multiple surveys of the same type, workers who took an earlier one won't be able to take a different kind of survey from you at all.
IsaacM says this should never happen, but experience begs to differ: requesters who use this method still trigger these emails. (There are many posts on the mTurk worker forum to this effect.) If there is a "glitch" in mTurk, workers will email you pleading for an unblock, and you will get a bad reputation that may make responses to future surveys come more slowly or not at all.

Preferred Method: Qualifications

Instead of potentially causing a lot of trouble, you can use mTurk's qualifications to keep workers out of surveys. It works like this: your survey has a qualification attached to it, called something like "Did my survey", that requires a value of 0. A worker requests the qualification and it is auto-granted to them at a value of 0; once they take the survey, you raise it to 1, locking them out of future runs.
First, create a file that looks like this, and name it something like no_retakes.properties:
name:No retakes please!
description:Prevent retakes on my survey
keywords:prevent, retakes
autogranted:true
autograntedvalue:0
To make the qualification, execute this command with the mTurk Command Line Tools (note: in this post I've used the Windows syntax for all command line examples; for OS X/Linux, prepend ./ to the command and append .sh to the first word, e.g. ./createQualificationType.sh):
createQualificationType -properties no_retakes.properties
You will get a QualTypeID printed to standard output as well as to a file called no_retakes.success. You need this ID to change workers' values later.
Make sure to add this qualification to your hit.properties file if you are making a new HIT with the CLT, or to the necessary qualifications in the hosted Requester GUI. Remember that the required value should be 0.
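As a sketch, the corresponding hit.properties entry might look like this (the qualification.N key syntax follows the CLT's sample properties files; the ID below is a placeholder for your own QualTypeID):

```properties
# Require the no-retakes qualification to equal 0 (worker has not yet taken the survey)
qualification.1:YOURQUALTYPEIDHERE
qualification.comparator.1:EqualTo
qualification.value.1:0
```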
Once the first run of the survey is complete, you can now raise everyone's values.
Whether you used the getResults command line tool or the web UI to fetch your results file, you will have the worker IDs of everyone who submitted work to your survey. Create a tab-delimited text file with the columns workerid and score. A few programs can create these files, such as Microsoft Excel and LibreOffice Calc: Excel calls them .tsv (tab-separated values) files, while Calc saves them as .csv with the delimiter changed to a tab (\t).
Your file should look something like this, with the symbol → representing a tab:
workerid→score
A1EXAMPLE→1
A2EXAMPLE→1
A3EXAMPLE→1
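If you'd rather not build this file by hand, a small shell sketch can generate it from your results file. This assumes a tab-delimited results file with a workerid column header (as getResults produces); the demo input below is invented for illustration:

```shell
#!/bin/sh
# Demo input: a tab-delimited results file with a workerid column
printf 'hitid\tworkerid\tanswer\nH1\tA1EXAMPLE\tfoo\nH2\tA2EXAMPLE\tbar\nH3\tA1EXAMPLE\tbaz\n' > results.txt

# Find the workerid column, then emit each unique worker once with a score of 1
awk -F'\t' '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "workerid") col = i;
              print "workerid\tscore"; next }
    col && !seen[$col]++ { print $col "\t1" }
' results.txt > no_retakes.tsv

cat no_retakes.tsv
# prints:
# workerid	score
# A1EXAMPLE	1
# A2EXAMPLE	1
```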
Then, run the following command to update everyone's qualification score (the -qualtypeid parameter takes the QualTypeID generated earlier and stored in no_retakes.success):
updateQualificationScore -qualtypeid TPREVENTRETAKESQUALIDEXAMPLE -input no_retakes.tsv

Harder method: Internal lists

A harder method to prevent retakes is to ask for a worker's worker ID when they begin your survey and, if it's found on a list of previous participants, tell them to return the HIT. Doing that is far out of the scope of this article, though, and the qualification method is better anyway: it prevents workers from even accepting a survey they cannot do, which leaves those HITs on the site for the people who can do them!
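For the curious, the core of that check is only a few lines of shell. A minimal sketch, assuming a hypothetical previous_workers.txt exclusion list with one worker ID per line:

```shell
#!/bin/sh
# Hypothetical exclusion list, one worker ID per line
printf 'A1EXAMPLE\nA2EXAMPLE\n' > previous_workers.txt

worker="A1EXAMPLE"
# -x matches the whole line, -q suppresses output; the exit status says whether it was found
if grep -qx "$worker" previous_workers.txt; then
    echo "already participated - please return the HIT"
else
    echo "ok to continue"
fi
```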
I hope this was useful for you, and I hope your survey gets lots of replies. :) Remember to pay workers fairly! (at least 12 cents per minute)

Wednesday, September 26, 2012

Do not threaten workers in your instructions






This was the title of a thread on TurkerNation. The screen cap pretty much says it all. This turns off workers.

Post from turkernation used with permission

 Let me explain a little further so you can understand why this is a pretty big deal for workers.


Rejections- All workers get them eventually. Workers understand that this sometimes happens, but they also know that any rejection will lower their approval percentage. Approval percentage is one of the qualifications requesters most commonly use to find qualified workers, and the best and most experienced workers do their best to keep this number above 99%.
For a new worker on Mturk, a few rejections can impact their approval percentage greatly. For every one rejection, a worker has to have 100 approvals to get their approval percentage back above 99%.
Most experienced turkers are in the 99.5-99.9% approval range, so a single rejection means these workers will have to complete 500-1000  HITs to remove the effect of one rejection.
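To put rough numbers on this: at an approval rate p, a worker's approvals-to-rejections ratio must be at least p/(1-p), so each rejection costs roughly that many approvals to absorb (a lower bound; the exact count depends on the worker's running totals). A quick sketch:

```shell
#!/bin/sh
# Approvals a worker needs per rejection to hold a given approval rate p:
# the approvals:rejections ratio must be at least p / (1 - p).
for p in 0.99 0.995 0.999; do
    awk -v p="$p" 'BEGIN { printf "%.1f%% -> ~%.0f approvals per rejection\n", p * 100, p / (1 - p) }'
done
# prints:
# 99.0% -> ~99 approvals per rejection
# 99.5% -> ~199 approvals per rejection
# 99.9% -> ~999 approvals per rejection
```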
This is the workforce that you want working on your HITs. They have the highest accuracy and have usually completed a wide variety of different HITs that give them the experience you need to work on any task you have in mind. If you turn them off in the first few sentences of your instructions, you will not be pleased with your results.

Blocks- Blocks are a big no-no. ONLY use blocks when you are positive that a worker was blatantly trying to cheat or steal from you. Amazon does not go out of its way to tell requesters how big a deal blocks are: when a worker is blocked, it can trigger Amazon to review the worker's account.
This is the sequence of events when this happens -
Day 1 - Amazon receives notice of block on worker account
Day 2 - Worker account is automatically suspended (cannot work or earn money)
Day 3-7 or more - Amazon reviews account.
Week or two later - Amazon makes a decision to permanently ban worker or reinstate account.

This is possibly a week or two of not being able to work or earn a living. So when the instructions state, "Note: we will review all work and reject any work that we find unsatisfactory. Workers whose work is rejected will also be blocked," what the worker actually reads is, "we could block you and you might lose your Mturk account."

I am not saying that a requester should pay for work that is not done properly. If you have clear instructions, have a well designed HIT, and a worker does not submit usable work, you are within your rights to reject. BUT, you are better off forming some sort of communication with your workers and publishing in small batches with your qualified workforce prior to rejecting dozens of HITs. If you are going to block, use blocks to rid Mturk of the scamming workers who try to undermine the workplace and not to punish honest workers who may have made a mistake.

Below is the worker's view of how this should be handled....
Used with permission of author




Wednesday, September 12, 2012

How to unreject assignments on Mechanical Turk

My name is Fredrick Brennan, most people know me as "frb" on TN and I'm going to be handling the technical posts on the requester blog. :) This is my first one, and I hope someone somewhere gets use out of it.

If you reject assignments on Mechanical Turk, you probably get workers asking you to unreject them moments after you do. If their complaints are valid, you can really feel like you're in a tough spot. The mTurk Requester site is pretty crippled, and it doesn't include this function by default. Luckily, there are two ways to unreject assignments (provided you have enough money in your account first):

First unrejection method: Easiest

Techlist's Dahn Tamir maintains an unrejection page here. However, this page has been known to go down. Further, as of September 2012, it does not properly run over HTTPS, meaning that it transmits a requester's details in plaintext. Therefore, I highly recommend you download it yourself and run it on your own server. Guides for this can be found for Apache, lighttpd, and nginx.

Second unrejection method: Harder, but possibly more rewarding

The second way to unreject is to install the API yourself (here I'll demonstrate with the Perl API) and call the ApproveRejectedAssignment operation.

Here's an example Perl script you can use to achieve this (you'll need the Net::Amazon::MechanicalTurk CPAN module; install it with cpan -f -i Net::Amazon::MechanicalTurk. The -f "force" flag is not optional; the module really is that shoddily written.)

#!/usr/bin/perl
# Thanks to Tani Hosokawa for the script!
use strict;
use Net::Amazon::MechanicalTurk;
use Getopt::Long;

my $sandbox = 0;
my %turk_options;
my $assignmentId;
GetOptions("assignmentId:s" => \$assignmentId, "sandbox" => \$sandbox);

# Default to the live service; pass --sandbox to use the sandbox instead
if (not $sandbox) {
    $turk_options{serviceUrl} = 'https://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester';
}

my $mturk = Net::Amazon::MechanicalTurk->new(%turk_options);

my $result = $mturk->ApproveRejectedAssignment(
    Operation    => "ApproveRejectedAssignment",
    AssignmentId => $assignmentId,
);

Then, you'll want to put together a text file with the IDs of the assignments you want to unreject, which should look something like this:

A2xxxxxxxxxx
A2xxxxxxxxxx
A2xxxxxxxxxx
A2xxxxxxxxxx
A2xxxxxxxxxx
A2xxxxxxxxxx

You can either modify the above Perl script or just use a simple Bash script to run it through the assignments. In either case, if you made the file in Windows, run dos2unix file.txt beforehand to turn the \r\n line endings into just \n.

The final script will look something like (run it with ./unreject.sh filename):

#!/bin/bash
# Usage: ./unreject.sh filename
dos2unix "$@"
for line in `cat "$@"`
do
    perl unreject.pl --assignmentId="$line"
    echo "$line"
done

If you got this far, you're done. :D

Tuesday, September 11, 2012

Requester Best Practices Guide

 If you are a new requester and have never read the Amazon Best Practices Guide, you should take a look at it.
http://mturkpublic.s3.amazonaws.com/docs/MTURK_BP.pdf

I am not the biggest fan of the customer service that Amazon provides for requesters, but this guide offers a good starting point for new requesters. These are the basics for starting on Mturk and it really is in your best interest to read the entire PDF and understand what it means.


The reason I am posting right now is that I just witnessed another new requester whose reputation is going down the tubes as I write. They committed many of the same errors that Lighting Buff did, and in the last two hours I have watched their ranking drop like a rock.
They published over 15,000 HITs in their very first batch, then waited over 6 days to approve/reject them.
They are rejecting roughly 5% of all HITs submitted, and the negative reviews are flooding in. Hopefully this requester will be able to catch this in time, but they really needed to read this document and this blog prior to posting their first HITs.

Wednesday, September 5, 2012

Tips for Academic Requesters on Mturk

Quick post for academic survey requesters on Amazon Mechanical Turk.

The comment below was made by a survey requester 

 



It is not the worker's fault that you are not getting quality results. There are ways to get excellent results with your surveys on Mturk if you follow a few simple rules.

1. Set your "HITs approved" qualification to ABOVE 97%, NOT 95%. Mturk is work, not school. A, B, C, D, F does not apply here.
A workers have approval ratings of 98% or 99%.
B workers have approval ratings of 96% or 97%.
F workers are anything below that.
The vast majority of good turkers are above the 97 percent approval rating, and it is very hard for any scamming worker to maintain an average above this percentage. There are two exceptions to this rule. Some new workers will work for the wrong requester and have their approval rating damaged by unjust rejections; because new workers have fewer HITs submitted, rejections influence their approval percentage more. The other exception is workers who are not in the United States. These workers are very limited in the HITs they can accept and are forced to work for more requesters who use "Plurality" to grade HITs. There is no way to avoid rejections when requesters use the ridiculous grading system of majority rules.

2. Set your "HITs submitted" qualification to over 5,000. This goes hand in hand with the point above: it is fairly easy to maintain a high approval percentage with few HITs submitted. Personally, if I were setting this qualification, I would use 10,000 HITs submitted, because it is almost impossible for a scamming worker to reach 10,000 HITs submitted while keeping over a 97 percent approval rate.

3. Use attention check questions in your surveys. If a worker is not reading the instructions for each answer, they will choose an incorrect answer. An example would be.....
"How are you feeling right now? Although we would like to know how you are feeling, please select "B" so we know you are paying attention."

4. Pay fairly. Academic requesters are notorious on Mturk for being some of the worst paying requesters.  The ABSOLUTE MINIMUM pay you offer should be $0.10 per minute. This works out to $6.00 per hour. Although this is below the federal minimum wage, good turkers will accept surveys for this pay. The best academic requesters pay between $0.20-$0.30 per minute.
Mturk is not charity and it is not volunteer work. If you do not have the budget to work within these guidelines, you need to seek additional funding or do not publish your hits. It is highly unethical to publish hits that pay below minimum wage. If you search Mturk with the word "survey" you will see results from many unethical academic requesters who have no respect for the workers who complete their tasks. Also, because you see these tasks, it means they are not getting completed and are just sitting.

5. Treat your workers with respect and dignity. Workers are not numbers and statistics. Workers are not lab rats. Workers are people and should be treated with respect.

6. DO NOT BLOCK WORKERS! Blocking is the standard response Amazon gives to academic requesters who require unique results for each questionnaire submitted (reference post here: IsaacM@AWS). It is not the right way to ensure unique results, and it has in the past been disastrous for workers. Make a qualification for your survey instead, and revoke or update it after the worker completes the survey. If blocks are not done correctly, Amazon will shut down a worker's account for review.

As always, you can contact me on TurkerNation for help.