Corpus Linguistics
Last update: December 15th, 2010
Tutorials
- Nadja Nesselhauf's web tutorial
- Corpus Linguistics (McEnery/Wilson) - companion website
- The Use of Corpora in Language Studies (McEnery/Wilson)
Overview of corpora available to date
- Well-known and influential corpora (Richard Z. Xiao)
English corpora online
- BYU corpus collection, which includes:
- TIME Magazine Corpus of American English (100 million words, 1920s - 2000s)
- Corpus of Contemporary American English (COCA, 385 million words from 1990 - present)
- BYU-BNC (British National Corpus, BNC, 100 million words, 1980s - 1993)
- Oxford English Dictionary corpus (37 million words, c900s - 2000s)
- Infosite for the British National Corpus
- BNCweb interface
- American National Corpus
- Michigan Corpus of Academic Spoken English (MICASE)
- Michigan Corpus of Upper-Level Student Papers (MICUSP)
- WebCorp: The Web as Corpus
- The International Corpus of English Homepage
- Collins WordbanksOnline (The Bank of English)
- The CHILDES System
German corpora online
- Liste deutschsprachiger Korpora
- DWDS Corpus
- Textkorpora des IDS - Übersicht
- COSMAS II (IDS Mannheim)
- NEGRA Corpus
- Wortschatz Deutsch
Parallel / Translation corpora (German-English or English-German)
- Parallel German-English Corpus "de-news", Daily News 1996-2000
- English/German translation corpus (TU Chemnitz)
- German(-English) parallel corpora (Europarl and German News)
- OPUS - an open source parallel corpus
Other languages
- French
- Spanish
- Portuguese
- Polish
Learner Corpora
- International Corpus of Learner English and other learner corpora
- Louvain International Database of Spoken English Interlanguage
- Cambridge Learners Corpus
- Corpus of Academic Learner English (CALE)
- Corpora 4 Learning - a website providing links and references for the use of corpora, corpus linguistics and corpus analysis in the context of language learning and teaching
Diachronic corpora
Link collections
- Michael Barlow's page on "Text Corpora and Corpus Linguistics"
- David Lee's Bookmarks for Corpus-based Linguists
(Web-based) tools
- ParaConc - a bilingual/multilingual concordancer that can be used in contrastive analyses, language learning, and translation studies/training (free demo available)
- Phrases in English (PIE)
- SketchEngine
- GlossaNet
Institutions
- University Centre for Computer Corpus Research on Language (UCREL), Lancaster, UK
- International Computer Archive of Modern and Medieval English (ICAME)
- Centre for Corpus Linguistics, University of Birmingham
- Centre for English Corpus Linguistics, University of Louvain, Belgium
- HUB Korpuslinguistik
- Corpora List
Journals
- Language Learning & Technology
- International Journal of Corpus Linguistics
- Corpus Linguistics and Linguistic Theory
- Corpora
- ICAME Journal
- Literary and Linguistic Computing