Master's Thesis

Master's Thesis (in German)

Parsing and Generation within uniform Architectures


Abstract

The thesis mainly compares some approaches for the integration of parsing and generation within a uniform architecture and gives a Prolog implementation of the most uniform one. Additionally some suggestions for future research directions are made.

Uniform architectures commonly are defined as a reversible grammar plus a single processing component. Although every declarative grammar should be independent of a specific processing direction, often grammars are only used for parsing and are not well-suited for generation. Besides issues of simplicity and elegance the use of one processing component allows for interleaving parsing and generation. Therefore it is possible to modell phenomena like monitoring (controlling the generation process with the parser; cf. [Neumann 1994]).

After defining some basic notions like reversibility, completeness and coherence of parsing and generation, pre-, post-, and in-order tree traversal (corresponding to top-down, bottom-up and left/head-corner analysis), control strategies (breath-first, depth-first), handling of alternatives (backtracking versus tabulation/chart) some approaches for parsing and generation are investigated with respect to these dimensions:


  • Van Noord's head-driven algorithms: head-corner parsing and semantic-head-driven generation ([van Noord 1993], [Shieber et al. 1990], [van Noord 1997]);
  • The first proposal of [Shieber 88] to use Earley deduction for parsing and generation;
  • The use of a head-driven selection strategy for generation in the context of Earley deduction of [Gerdemann 1991];
  • The Uniform Tabular Algorithm (UTA) of [Neumann 1994], which for the first time truely integrates parsing and generation;
  • Bottom-up Earley deduction of [Erbach 1995].
  • The following table gives an overview of the approaches mentioned above:

    van Noord [1993]

    Shieber [1988]

    Gerdemann [1991]

    Neumann [1994]

    Erbach [1995]

    Treatment of alternatives

    backtracking

    chart

    chart

    chart

    chart

    Control

    only depth-first

    depth-first to breath-first

    only breath-first

    depth-first to breath-first

    depth-first to breath-first

    Processing direction

    bottom-up

    top-down

    top-down

    top-down

    bottom-up

    Selection function: Parsing

    syntactic head

    leftmost element

    leftmost element

    leftmost element

    --

    Selection function: Generation

    semantic head

    leftmost element

    semantic head

    semantic head

    --

    Indexing: Parsing

    not necessary

    string position

    string position

    string

    string position

    Indexing: Generation

    not necessary

    --

    string position

    semantics

    semantics

    Uniformity

    +/-

    +

    -

    +++

    ++



    The last row referring to the uniformity of parsing and generation should not be taken too seriously: it seems that the notion of `uniformity' has not yet been precisely defined. I propose to define it as a uniform analysis strategy: a uniform architecture then consists of a reversible grammar and a single processing component which performs only one kind of tree traversal. This is especially useful in case of arbitrarily underspecified input feature structures (e.g. with part of the string and of the semantics or pragmatics instantiated) the task of the processing component cannot be determined as `pure generation' or `pure parsing'. If different analysis strategies are used for parsing and generation it is hard to decide which one to use. A uniform analysis strategy avoids this problem. Since head-driven algorithms perform both tasks equally efficient a uniform analysis strategy should select the head element of a grammar rule first.


    Listing of Prolog files for SICStus and Quintus Prolog

    Load file

  • top.pl


  • Semantic-head-driven generation
  • head.pl
  • memo_head.pl

    Top-down Earley-deduction
  • chart.pl
  • aux.pl

    Example grammar
  • grammar.pl


  • Interpreter tests
  • test_suite.pl
  • tests.pl
  • examples.pl

    Termination problem of left-corner parser
  • lc.pl

  • References

    @PhdThesis{erb95,
      author =       "Gregor Erbach",
      title =        "Bottom-Up Earley Deduction for Preference-Driven Natural Language Processing",
      school =       "Universit{\"a}t des Saarlandes, Saarbr{\"u}cken.
                      Draft of 31.8.1995.",
      year =         "1995"
    }
    
    @PhdThesis{gerde91,
      author =       "Dale Gerdemann",
      title =        "Parsing and Generation of Unification Grammars",
      school =       "University of Illinois",
      year =         "1991"
    }
    
    @PhdThesis{noord93,
      author =       "Gertjan van Noord",
      title =        "Reversibility in Natural Language Processing",
      school =       "University of Utrecht, 1993",
      year =         "1993"
    }
    
    @PhdThesis{neu94,
      author =       "G{\"u}nter Neumann",
      title =        "A Uniform Computational Model for Natural Language Parsing and Generation",
      school =       "Universit{\"a}t des Saarlandes, Saarbr{\"u}cken",
      year =         "1994"
    }
    
    @Article{noord97,
      author =       "Gertjan van Noord",
      title =        "An efficient {I}mplementation of the head-corner-{P}arser",
      year =         1997,
      journal =      "{\rm to appear in} Computational Linguistics"
    }
    
    @InProceedings{shieb88,
      author =       "Stuart M. Shieber",
      title =        "A {U}niform {A}rchitecture for {P}arsing and {G}eneration",
      pages =        "S.614-619",
      booktitle =    "Proceedings of the 12th International Conference on Computational Linguistics",
      year =         "1988"
    }
    
    @Article{shieb90,
      author =       "Stuart M. Shieber and Gertjan van Noord and  Fernando C.N. Pereira and R.C. Moore.",
      title =        "Semantic-head-driven {G}eneration",
      volume =       "16",
      number =       "1",
      pages =        "S.30-42",
      journal =      "Computational Linguistics",
      year =         "1990"
    }