Research

Automated Analysis and Information Extraction of Text Documents

There is considerable scope for development in information extraction from text documents. One aspect I have been particularly interested in is the automated detection of plagiarism, or "similarity", between documents. My approach has been to adapt suitable methods from the field of bioinformatics and apply them to the analysis of collections of text documents. An additional benefit is the ability to visualise and identify trends and relationships within these collections, a capability with potential applications such as the automatic versioning of source code.
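As one illustration of how a bioinformatics method might carry over to text, the sketch below applies Smith-Waterman local alignment, normally used on DNA or protein sequences, to sequences of words. This is an assumption made for illustration only: the description above does not name a specific algorithm, and the function names, scoring values, and normalisation are hypothetical rather than taken from any of the listed projects.

```python
import re

# Illustrative scoring values (assumptions, not taken from the source).
MATCH = 2      # reward for two identical words
MISMATCH = -1  # penalty for aligning two different words
GAP = -1       # penalty for skipping a word in either document


def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def local_alignment_score(a, b):
    """Return the best Smith-Waterman local alignment score of token lists a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            curr[j] = max(
                0,                   # start a new local alignment here
                prev[j - 1] + step,  # align word i of a with word j of b
                prev[j] + GAP,       # skip a word in a
                curr[j - 1] + GAP,   # skip a word in b
            )
            best = max(best, curr[j])
        prev = curr
    return best


def similarity(text_a, text_b):
    """Scale the alignment score to [0, 1] by the best score the shorter text allows."""
    a, b = tokenize(text_a), tokenize(text_b)
    shortest = min(len(a), len(b))
    if shortest == 0:
        return 0.0
    return local_alignment_score(a, b) / (MATCH * shortest)
```

Calling `similarity(doc_a, doc_b)` on two documents gives a value near 1 when one contains a long passage of the other, even with a few inserted or altered words, and a value near 0 when they share no aligned run of words. Pairwise scores of this kind could then feed a clustering or visualisation step over a whole collection.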

Principal Investigators

Projects

  • A Self-Organising Approach to Document Space Visualisation
  • SNITCH: An open-source software package for document analysis and detection of plagiarism


