Compounding and the generative lexicon - lecture notes
Anders Soegaard
University of Copenhagen
[To some extent, it is fair to say that language users speak matter. But matter is spoken within the limits of human conditions. That compounding constructions are possible forms in most of the world's languages relates to human conditions. On the other hand, the fact that such constructions are employed and that they serve a variety of functions relates to matter and our conceptualization of it. The compounding model to be presented here is ontological in the sense that every attribute to be assigned a value in the interpretation of a compound is rooted in ontological notions of matter and mind. But human conditions are also admitted to place certain constraints on the design of the model and its formal architecture.]
It is commonly accepted that to interpret a two-constituent N-N compound, alpha-beta, one must know the meaning of alpha and the meaning of beta, as well as their semantic relationship. This relationship is sometimes rather indirect; for example, alpha may refer metonymically to something that is related to beta.
It is also commonly accepted today that the set of possible compounding relationships is infinite. Thus, what I call REDUCTIONIST THEORIES in the handout - theories that rely on the enumeration of possible compounding relationships - are by definition inadequate. TRANSFORMATIONAL THEORIES and PRAGMATIC THEORIES argue that compounding relationships are derived from underlying relative clauses and from pragmatic knowledge about the world, respectively. Both kinds of theory thus account for the infinity of compounding relationships, but in the handout I point out their shortcomings.
The approach to compounding I deal with today is the so-called SLOT-FILLER APPROACH and, more specifically, the generative lexicon (GL) version of this approach.
In slot-filler approaches to compounding, the constituents, alpha and beta, are conceptualized as (structured) bundles of features, and the modifying constituent simply "adds" a feature to the other constituent in the compounding process. On page 2 of the handout, I claim that current slot-filler theories face serious challenges:
Firstly, even in endocentric compounds, modifying constituents very often add more than one feature.
Secondly - and this might seem a more serious challenge to contemporary theories - exocentric compounding is productive in many languages (Soegaard, 2003). Though probably less productive than endocentric compounding, it needs to be accounted for.
Thirdly, some languages exhibit more open and symmetric constructional templates than the ones slot-filler theories typically account for.
As mentioned, the standard GL approach to compounding is basically a slot-filler approach. In GL, composition in compounds is conceived of as "specification of one of the semantic components within the qualia of the head noun" (Johnston and Busa, 1999).
Johnston and Busa analyze compounds derived by specification of juice. Here, juice is the main bundle of features, and the premodifier just fills the slot of the second argument of the head noun's agentive role, made_from. That is, orange juice means juice made from orange.
In the case of lemon juice, the agentive role of the head noun juice specifies a creation event which, for convenience, we express as the relation made_from. Such a relation holds between juice and a natural kind-entity such as fruit or vegetable. Thus, in the compound lemon juice the modifying noun lemon is to further subtype this component of the meaning of juice. This is possible because lemon is a subtype of fruit. (Johnston and Busa, 1999:175)
By doing so, Johnston and Busa account for the majority of juice-compounds, but it is important to stress that they exclude compounds such as pan juice, cooking juice, and stomach juice. As a consequence, all such compounds must be labeled opaque and idiosyncratic, to be explained only by semantic drift. This is of course unattractive, since such patterns may be or become productive.
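The slot-filler analysis of juice-compounds can be sketched in code. This is a minimal illustration, not Johnston and Busa's formalization: the toy type hierarchy and the names SUPERTYPES, MADE_FROM_ACCEPTS, is_subtype and specify_made_from are all hypothetical. It only shows why lemon juice is licensed while pan juice is excluded when the made_from slot accepts nothing but subtypes of fruit or vegetable.

```python
# Toy type hierarchy (assumed for illustration; not taken from the source).
SUPERTYPES = {
    "lemon": "fruit",
    "orange": "fruit",
    "carrot": "vegetable",
    "fruit": "natural_kind",
    "vegetable": "natural_kind",
    "pan": "artifact",
}

# Types accepted by the second argument of made_from in the
# agentive role of "juice" (following the quoted passage).
MADE_FROM_ACCEPTS = {"fruit", "vegetable"}


def is_subtype(t, ancestor):
    """Return True if t equals ancestor or inherits from it."""
    while t is not None:
        if t == ancestor:
            return True
        t = SUPERTYPES.get(t)
    return False


def specify_made_from(modifier):
    """Fill the made_from slot of juice; return None if the modifier is excluded."""
    if any(is_subtype(modifier, t) for t in MADE_FROM_ACCEPTS):
        return f"made_from(juice, {modifier})"
    return None


print(specify_made_from("lemon"))  # made_from(juice, lemon)
print(specify_made_from("pan"))    # None
```

On this analysis pan juice gets no interpretation at all, which is exactly the gap the construction hierarchy is meant to close.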
To improve the GL model and to make it account for conceptual exocentricity, I propose a construction hierarchy comprising a handful of distinct, but interrelated levels of more or less schematic constructions. Each level is loosely associated with one or more feature labels.
For the sake of brevity, I ignore three attributes in this presentation: category, argument structure, and event structure. All three are well known; category - or cat - is explained in the handout. Other traditional attributes are qualia, context, and orthography. Three more controversial - or at least untraditional - attributes employed by this model are concept, taxon, and probability, and I will spend some time explaining their contribution to the overall architecture of the model. My use of qualia is traditional, so I won't discuss it. But since my use of context and orthography is slightly unconventional, I briefly explain these notions.
The concept attribute specifies conceptual types. The conceptual typology of compounding is based on a distinction between literal, metonymical, and metaphorical relatedness. There is no clear-cut distinction between literal and metonymical relatedness, whereas the distinction between metonymical and metaphorical relatedness is sharp. However, a compound may have two or more interpretations, in some of which the modifier's conceptual status changes.
A two-constituent compound, alpha-beta, consists of at least one pointer and at most one modifier. The more traditional notion of "head" is misleading, since it does not allow for gradience between literal and metonymical relatedness, and since it does not distinguish between grammatical and conceptual percolation. In this typology, a compound inherits its profile or conceptual structure (or parts of it) from the pointer (or parts of it). A head entails literal relatedness, and literal relatedness entails a pointer, but not the other way around. In other words, there exist metonymical and metaphorical pointers.
There are 10 "first-level" conceptual types, i.e., non-extended constructions. One example is:
8. [M(m2)-P(m1)]: Exocentric compounds with metaphorical left constituent modifiers, such as duckbill and shovelhead.
Note that since compounds are by default symmetric until they are linked to conceptual type constructions, non-conservative modifier-pointer distribution does not pose a problem to our model. Such distribution is exemplified in (1) and (2):
Conceptual types can be largely derived from taxon and context information. This is important for processing.
Regarding the taxon attribute: In the handout I point out that what is traditionally called "taxonomies" includes a variety of different lexical inheritance networks. This is of course reminiscent of Anna Wierzbicka's work, but it is important to stress here, since different lexical inheritance networks relate differently to qualia roles.
Taxonomy specifications are relevant, though maybe not strictly necessary, in the study of compounding, since many compounding constructions specify for taxonomic levels. According to the STO corpus of Danish written language, the frequency of generic-level modifiers in compounds derived from the analogy base ____-gård [Ø-yard], such as andegård [duck-yard] and hundegård [dog-yard], is two to three times as high as that of life-form-level modifiers, as in dyregård [animal-yard]. Other compounding constructions, such as ____-bod [Ø-stall], only select for upper-level modifiers.
Taxonomic specification largely results from Gricean pragmatics. However, pragmatics can't handle all cases, e.g., the restrictions on slot-filling in ____-bod [Ø-stall].
The probability attribute was proposed in Copestake and Lascarides (1997) to resolve ambiguous lexicalized compounds. It assigns a weight to every lexical item, including constructions. Probability is relevant to interpretation as long as no coherence constraints are violated. In this paper, I tentatively propose that every (qualia) feature be assigned a weight such that, for example, default metaphorical projection triggers specific features: in metaphorical use, the agentive and telic roles of mother are the most likely to be projected, whereas constitutive roles are frequently projected for metaphorical interpretations of legs. Such information is, I believe, important for interpreting novel compounds. Qualia feature weights can be extracted from corpora through collocation studies, since different syntactic surroundings utilize different qualia features (Pustejovsky, 1991). Probability values for lexical items are useful in dot object and homonymy resolution. In the attribute-value matrices below, however, probability values are not specified, for simplicity.
In the fourth part of the handout, I view context as a constantly updated function, but it is commonly accepted that there are also prototypical contexts or situations. In this model, situational information is provided by lexical contributions to discourse structure. Thus, the context value only specifies the semantic domain. Domain inconsistencies license metaphorical constructions.
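The idea of per-feature weights can be made concrete with a small sketch. Everything here is an assumption for illustration: the table PROJECTION_WEIGHTS, the function likely_projections, and above all the numbers, which are placeholders rather than corpus estimates. Only their ordering follows the text (agentive and telic for mother, constitutive for legs).

```python
# Hypothetical projection weights per qualia role. The numbers are
# placeholders, not corpus estimates; only their ordering follows the text.
PROJECTION_WEIGHTS = {
    "mother": {"formal": 0.1, "constitutive": 0.1, "telic": 0.4, "agentive": 0.4},
    "legs": {"formal": 0.2, "constitutive": 0.5, "telic": 0.2, "agentive": 0.1},
}


def likely_projections(noun, k=1):
    """Return the k qualia roles most likely to be projected metaphorically."""
    weights = PROJECTION_WEIGHTS[noun]
    # Sort by descending weight; ties are broken alphabetically by role name.
    ranked = sorted(weights, key=lambda role: (-weights[role], role))
    return ranked[:k]


print(likely_projections("legs"))       # ['constitutive']
print(likely_projections("mother", 2))  # ['agentive', 'telic']
```

In a fuller model the weights would be estimated from collocation data, as suggested above, rather than stipulated.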
Some constructions have partially or wholly specified orthography values. (3) in the handout highlights the need for such lexical item specific constructions. Though sea-legs is probably fully lexicalized, nonce compounds too provide irregular analogy bases for novel compounds, as well as for the reinterpretation of blocked compounds.
In the handout I point out how cognitive semiotics provides a useful framework for testing formal analyses of compounds. For brevity, I won't discuss it here, but the lexical representation of (3) depends on the semiotic analysis.
The lexical representation of a compounding construction is pretty straightforward. (4) represents the attribute-value matrix for sea-legs. It is clear that conference legs is not fully analogous to sea-legs; in fact, the differences, i.e., the taxon and context values of the modifier constituents, are what make conference legs slightly humorous - and ponderous. (Poetic effects often stem from taxon and context shifts.) (In the graph, structure sharing is indicated by numerals, i.e., tags.)
ORTH=sea-legs
QUALIA_FORMAL = u:limb
QUALIA_CONSTITUTIVE = part_of(u,w:human), part_of(z:knee,u)
QUALIA_TELIC = walk(w)
QUALIA_AGENTIVE = result_from(u,be_at(w,1))
TAXON = part_human_body - l2
CONTEXT = c2
cn(1)
CONCEPT_M(m1)_ORTH = sea
CONCEPT_M(m1)_QUALIA = ...
CONCEPT_M(m1)_TAXON = taxon1 - l1
CONCEPT_M(m1)_CONTEXT = c2
cn(2)
CONCEPT_P(l)_ORTH = legs
CONCEPT_P(l)_QUALIA_FORMAL = x:limb
CONCEPT_P(l)_QUALIA_CONSTITUTIVE = part_of(x,y:human), part_of(z:knee,x)
CONCEPT_P(l)_QUALIA_TELIC = walk(y)
CONCEPT_P(l)_TAXON = taxon_human_body - l2
CONCEPT_P(l)_CONTEXT = c
Note that composition in (4) is not just "specification of one of the semantic components within the qualia of the head noun". The construction provides the agentive role on its own. Also, this is a case of multiple feature exchange, since the modifier contributes a context value too.
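The division of labor in (4) can be traced mechanically. The following sketch is a simplification under stated assumptions: the dictionaries keep only the predicate names and context values from (4), leaving out variables, tags, and taxon values, and the function feature_sources is hypothetical. It reports, for each feature of the compound, whether it comes from the pointer, the modifier, or the construction itself.

```python
# Simplified transcription of (4): predicate names and context values only.
pointer = {
    "ORTH": "legs",
    "QUALIA": {"FORMAL": "limb", "CONSTITUTIVE": "part_of", "TELIC": "walk"},
    "CONTEXT": "c",
}
modifier = {"ORTH": "sea", "CONTEXT": "c2"}
compound = {
    "ORTH": "sea-legs",
    "QUALIA": {
        "FORMAL": "limb",
        "CONSTITUTIVE": "part_of",
        "TELIC": "walk",
        "AGENTIVE": "result_from",
    },
    "CONTEXT": "c2",
}


def feature_sources(compound, pointer, modifier):
    """Map each qualia role and the context value of the compound to its source."""
    sources = {}
    for role, value in compound["QUALIA"].items():
        # A role not inherited from the pointer is supplied by the construction.
        inherited = pointer["QUALIA"].get(role) == value
        sources[role] = "pointer" if inherited else "construction"
    if compound["CONTEXT"] == pointer["CONTEXT"]:
        sources["CONTEXT"] = "pointer"
    elif compound["CONTEXT"] == modifier["CONTEXT"]:
        sources["CONTEXT"] = "modifier"
    else:
        sources["CONTEXT"] = "construction"
    return sources


print(feature_sources(compound, pointer, modifier))
```

For sea-legs this yields pointer for the formal, constitutive, and telic roles, construction for the agentive role, and modifier for the context value.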
There is another example in the handout, a Mandarin Chinese copulative compound, which unfortunately I don't have the time to discuss.
Our model of N-N compounding is universalist. The conceptual types are the result of typological studies. Taxonomies and pseudo-taxonomic inheritance hierarchies are also universal architectures according to cultural anthropology; though specific nodes differ from culture to culture, and what is a generic term in one language might be a specific term in another. Since qualia structure is shared knowledge (or knowledge strategies) retrieved for lexical purposes, it is by definition structurally culture-independent (though its content of course varies). In sum, languages might employ different parts of the proposed construction hierarchy, but in it, there is a matching construction for all novel N-N compounds, whether or not it is a highly specific construction (for classification of compounds in Danish, see appendix, in which I classify a small sample of compounds).
Regarding the interpretation of compounds, our model seems relatively complex, since we allow for multiple compounding constructions. Our model is still compositional, but there are many possible input constructions to match the input constituents. That is, many mappings are allowed from different parts of the representational framework.
One might ask whether such a model is too complicated. Two important factors radically reduce its complexity: (a) The constructions are ordered hierarchically and thus allow for fast top-down processing. In processing a novel compound, it is not necessary to choose between all possible constructions in the hierarchy; only between daughter constructions at relevant nodes. (b) Some languages (such as English) might lexicalize default constructions (Copestake and Briscoe, 1995).
I briefly sketch the principles of compound interpretation in the handout. In this model, the interpretation of compounds is construction-dependent and context-dependent, and multiple feature exchange is preferred. As proposed by Copestake and Lascarides (1997), compound interpretation is taken to be a matter of optimality calculations. Here, what matters is the probability of the qualia features of the constituents, relative to the possible constructions (i.e., constructions that cohere with the constituents) and to the specific context (or discourse structure), rather than just the probability of whole lexical items.
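Point (a) can be illustrated with a small top-down search. This is a sketch under assumptions, not the hierarchy proposed in the handout: the class Construction, the function most_specific, the node names other than the type-8 label, and the matching tests (including the equation of endocentricity with a literal pointer) are all hypothetical. It shows that daughters are tested only below constructions that already match, so whole subtrees are skipped.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Construction:
    name: str
    matches: Callable[[dict], bool]
    daughters: List["Construction"] = field(default_factory=list)


def most_specific(node, compound, visited):
    """Return the most specific matching constructions at or below node.

    Daughters are tested only when their mother matches, so whole
    subtrees are skipped. Every tested node is recorded in visited.
    """
    visited.append(node.name)
    if not node.matches(compound):
        return []
    found = []
    for daughter in node.daughters:
        found.extend(most_specific(daughter, compound, visited))
    return found or [node.name]


# Hypothetical hierarchy; node names and tests are illustrative only.
juice = Construction("X-juice", lambda c: c["right"] == "juice")
endo = Construction("endocentric", lambda c: c["pointer"] == "literal", [juice])
type8 = Construction(
    "M(m2)-P(m1)",
    lambda c: c["modifier"] == "metaphorical" and c["pointer"] == "metonymical",
)
exo = Construction("exocentric", lambda c: c["pointer"] != "literal", [type8])
root = Construction("N-N", lambda c: True, [endo, exo])

duckbill = {"right": "bill", "modifier": "metaphorical", "pointer": "metonymical"}
visited = []
print(most_specific(root, duckbill, visited))  # ['M(m2)-P(m1)']
print(visited)  # ['N-N', 'endocentric', 'exocentric', 'M(m2)-P(m1)']
```

The X-juice node is never tested for duckbill, because its mother already fails; in a realistically sized hierarchy this pruning is what keeps processing fast.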
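One way to picture such an optimality calculation is as a filter followed by a ranking. The sketch below is an assumption-laden illustration, not the calculation of Copestake and Lascarides (1997): the class Candidate, the function best_interpretation, the product scoring rule, the candidate names, and all numbers are hypothetical placeholders. It also treats coherence with the context as a blocking-only filter, which is just one possible reading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    construction: str           # name of the compounding construction
    feature_weight: float       # weight of the qualia feature it exchanges
    construction_weight: float  # weight of the construction itself
    coherent: bool              # does it cohere with constituents and context?


def best_interpretation(candidates):
    """Drop incoherent candidates, then pick the highest-scoring one."""
    viable = [c for c in candidates if c.coherent]
    if not viable:
        return None
    return max(viable, key=lambda c: c.feature_weight * c.construction_weight)


# Placeholder candidates for some novel compound; all values are invented.
candidates = [
    Candidate("literal-telic", 0.9, 0.5, coherent=False),
    Candidate("metonymical-agentive", 0.6, 0.5, coherent=True),
    Candidate("metaphorical-constitutive", 0.4, 0.5, coherent=True),
]

best = best_interpretation(candidates)
print(best.construction)  # metonymical-agentive
```

The highest-weighted candidate loses here because it violates a coherence constraint, which mirrors the claim that probability matters only as long as no such constraint is violated.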
Our account, however, leaves some questions unanswered. The two most important questions might be: a) What is the precise nature of the context function (is it a blocking-only function, or does it assign a value to every possible interpretation)? b) How is the possible set of interpretations defined? This has of course to do with the possible existence of "default constructions".
In sum, since both endocentric and exocentric compounding are productive, the semantics of compounding is not explained by listing every single compound. Nor is it explained by enumerating semantic relationships; or simply by passing the job on to trash can pragmatics. Rather, the meaning of a compound should be compositionally derived from the input constituents, a more or less schematic construction, and contextual information. In dealing with compounding (and to explain semi-systematic ambiguities) it thus seems crucial to allow for multiple compounding constructions. Since different compounding constructions possibly apply in the interpretation of a compound, the semantic output is a result of relatively complex optimality calculations. Fortunately, complexity is reduced by the architecture of the construction hierarchy (and, maybe, by "default constructions").
[Let me finally attempt to relate this paper to the title of the workshop and point out to you that most of our model can be described informally in terms of commonsense ontology. Let me remind you that qualia is the cornerstone of Aristotelian ontology. Taxonomies are intended as models of nature. The notion of metonymy is based on two distinct ontological relationships, namely, causation and part-whole relationships. Metaphor is based on ontological analogy. Nor are semantic domains just arbitrary mental constructs; rather, they mirror the world and our representation of it. I do not wish to make any strong, general philosophical claims about the nature of language; but it seems to me that these findings might (also) be taken to suggest a very intimate relationship between ontological knowledge and linguistic coding.]