We recently had another research paper on our work in S-CASE accepted at a conference on natural language processing. The paper describes our efforts to improve a parsing model that automatically maps software requirements written in natural language to formal representations based on semantic roles.
State-of-the-art semantic role labelling systems require large annotated corpora to achieve full performance. Unfortunately, such corpora are expensive to produce and often do not generalise well across domains. Even in-domain, errors are common where syntactic information does not provide sufficient cues. In this paper, we mitigate both problems by employing distributional word representations gathered from unlabelled data. The rationale for this approach lies in the distributional hypothesis of Zellig Harris, which states that words occurring in the same contexts tend to have similar meanings.
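The distributional hypothesis can be illustrated with a toy example (this is not the representation-learning method used in the paper, just a minimal count-based sketch): build context-word vectors from a small unlabelled corpus and compare words by cosine similarity. Words that share contexts, such as two verbs appearing in the same requirement template, end up with similar vectors.

```python
from collections import Counter
from math import sqrt

# Tiny unlabelled "corpus" of requirement-like sentences (invented for
# illustration only).
corpus = [
    "the system shall store the user data",
    "the system shall delete the user data",
    "the service must store the session data",
]

def context_vectors(sentences, window=2):
    """Count, for each word, the words appearing within a +/-window
    context around its occurrences."""
    vectors = {}
    for sent in sentences:
        tokens = sent.split()
        for i, word in enumerate(tokens):
            ctx = vectors.setdefault(word, Counter())
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    ctx[tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = context_vectors(corpus)
# "store" and "delete" occur in near-identical contexts, so their
# vectors are more similar than those of an unrelated word pair.
print(cosine(vecs["store"], vecs["delete"]))
```

In practice the paper draws on representations learned from much larger unlabelled corpora, but the intuition is the same: context overlap is used as a proxy for similarity of meaning.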
While straightforward word representations of predicates and arguments have already been shown to be useful for semantic analysis tasks, we show that further gains can be achieved by composing representations that model the interaction between predicate and argument and capture full argument spans.
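One simple way to realise both ideas is sketched below; the exact composition functions used in the paper may differ, and the embeddings here are random stand-ins for pretrained vectors. The span of an argument is captured by averaging the vectors of all its words, and the predicate-argument interaction is modelled by combining the predicate vector with the span vector, here via an element-wise product alongside the two individual vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8
# Toy stand-in embeddings (assumed, not real pretrained vectors).
vocab = ["store", "the", "user", "data"]
emb = {w: rng.standard_normal(dim) for w in vocab}

def span_vector(words):
    """Average the word vectors over the full argument span."""
    return np.mean([emb[w] for w in words], axis=0)

def compose(predicate, span_words):
    """Concatenate the predicate vector, the argument-span vector, and
    their element-wise product as one joint predicate-argument feature."""
    p = emb[predicate]
    a = span_vector(span_words)
    return np.concatenate([p, a, p * a])

# Predicate "store" with the full argument span "the user data".
feat = compose("store", ["the", "user", "data"])
print(feat.shape)  # (24,) = 3 * dim
```

The element-wise product term is what lets a downstream classifier pick up on predicate-specific argument preferences, which a plain concatenation of the two vectors cannot express as directly.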