Registry
Module Specifications
Archived Version 2019 - 2020
Description
This course introduces the fundamentals of statistical machine translation.
Learning Outcomes
1. Discuss the challenges associated with machine translation
2. Explain the noisy channel model underpinning statistical machine translation
3. Demonstrate how a statistical translation model can be inferred from a parallel corpus of texts using unsupervised machine learning techniques
4. Explain the concept of statistical language modelling and how it fits into the basic SMT architecture
5. Explain the concept of decoding and be in a position to implement a beam decoder
6. Evaluate a statistical machine translation system using at least one automatic metric
7. Demonstrate a knowledge of the state-of-the-art in statistical machine translation
8. Train, test and evaluate MT systems using the open-source Moses toolkit
9. Implement a language modeller (including smoothing) and a basic word aligner
All module information is indicative and subject to change. For further information, students are advised to refer to the University's Marks and Standards and Programme Specific Regulations at: http://www.dcu.ie/registry/examinations/index.shtml
Indicative Content and Learning Activities
Noisy Channel Model of Statistical Machine Translation
The noisy channel model and its link to Bayes' Theorem.

Evaluating SMT systems
The relative advantages and disadvantages of human evaluation, automatic evaluation and task-based evaluation. The BLEU evaluation metric.

Language Modelling
The role of language modelling in SMT. The importance of smoothing in language modelling.

Translation Models
Learning a word-based translation model from a parallel corpus using Expectation Maximization. Deriving a phrase-based model from a word-based model. The relative strengths and weaknesses of various models.

Decoding
A beam search decoding algorithm for SMT. Techniques for pruning the search space.

Encoding Linguistic Information in an SMT system
Techniques for including morphological, syntactic and semantic knowledge in an SMT system.
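
As an illustration of the first topic above, the link between the noisy channel model and Bayes' Theorem is usually written as follows. The symbols are assumptions chosen for this sketch and are not taken from the module materials: f is the source sentence, e a candidate translation, and ê the translation the system outputs.

```latex
\hat{e} = \arg\max_{e} P(e \mid f)
        = \arg\max_{e} \frac{P(f \mid e)\, P(e)}{P(f)}
        = \arg\max_{e} P(f \mid e)\, P(e)
```

The denominator P(f) can be dropped because it does not depend on e. In this reading, P(f | e) corresponds to the translation model, P(e) to the language model, and the search for the maximising e to decoding.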
Indicative Reading List
Other Resources
None
Programme or List of Programmes