Some brief thoughts
The question (and variations on the question):
"Where is the evidence that digital games and worlds are useful in learning situations?"
...is not something that can be briefly and satisfactorily answered e.g. "Here it is!" But it's a question that has been asked for several years, not just in university or school-based education. For example, the recent cycle of interest in Gamification has prompted many in the academic and business communities to ask:
"How can I build a game into this process that people have to do to make it 'better'?"
It's the issue of what 'better' actually is, and of proving that introducing some kind of game will make the process 'better', that is troublesome.
The opening of the Wikipedia entry for evidence-based practice, as of August 11th 2012, is a good starting point:
Evidence-based practice (EBP) is an interdisciplinary approach gaining ground after 1992. It started in medicine as evidence-based medicine (EBM) and spread to other fields such as nursing, psychology, education, library and information science and other fields. Its basic principles are that all practical decisions made should:
1) be based on research studies and
2) that these research studies are selected and interpreted according to some specific norms characteristic for EBP.
Typically such norms disregard theoretical studies and qualitative studies and consider quantitative studies according to a narrow set of criteria of what counts as evidence. If such a narrow set of methodological criteria are not applied, it is better instead just to speak of research based practice.
This is as applicable to digital games and worlds in education as it is to the subject domains mentioned in the Wikipedia entry. Having said that, obtaining evidence here is not as straightforward as it is in more quantitative areas. Much of the research output in the 'games in education' field is qualitative in nature: interviews and debriefs with students, chat logs from virtual world sessions, and evaluation forms.
Yes, some results from digital game or world use can be quantitatively measured: test scores, performance times, success percentages, and information retention rates, for example. This data is sometimes superficial but relatively easy to collect, especially when it consists of one-off measurements rather than longitudinal ones. But other attributes resulting from digital game or world use, such as permanent changes in emotional states (can a person be e.g. "21 percent more confident"?), the long-term increase in the literacy or communication ability of a student, or an improved cognitive ability in analyzing work situations, are more difficult, costly, and time-consuming to measure.
Hence, there isn't as much 'deep' quantitative research data as some would prefer, and much of the existing data is qualitative in nature. So more often than not, accurately answering a specific question along the lines of:
"Can I use a digital game to improve learning in this situation?"
...will result in a reply of:
"Possibly, but never definitely; here is the data from studies of similar situations, and here are some guidelines and recommendations from other situations which will increase the chances of your game use being successful."
In other words, even with an evidence-based approach, there are rarely easy, simple, and risk-free digital game* or world-based solutions to a learning or process requirement. The evidence-based solution is likely to be a combination of various quantitative and qualitative data and results, along with guidelines from previous situations, experiments and research.
I've been collecting academic papers, articles, books and other ephemera on digital games and virtual worlds in education since the late 1990s. This collection has become useful when dealing with previous clients and project partners. Adding more relevant ephemera to it is essential, both to keep up with recent research and to provide contemporary analysis for the people I'm working with.
The emergence of open access to research literature, and the opportunities provided by blogging, presentation repositories and social media, have increased the amount of ephemera out there. This is not wholly a good thing. It's difficult to keep up with what is now a torrent of writing, across various media, in the research field. Worse, it makes the identification of genuine, and genuinely useful, evidence and data more difficult and time-consuming. It's easier to find a needle in a bale of hay than in a haystack.
Hence this summer I've come round to changing the focus of my personal collection to make it more explicitly evidence-based. This fits in with a more evidence-based approach to work for clients, and also forces a more stringent set of criteria on what I look for and what gets included. Essentially, this means looking at each candidate item and deciding whether it contains valid and relevant evidence of digital game or world use (or attempted use) in a learning situation.
* Game design itself is a very uncertain science, as shown by the large development costs of commercial games and the failure of most of them to be successful.