Category theory in theoretical linguistics: A monadic semantics for root syntax

ACT2022 e-poster (July 18–22, 2022)

Chenchen (Julio) Song, cjs021 AT zju.edu.cn
School of International Studies, Zhejiang University

❶ In a nutshell

Learn more from my 4-min video or blogpost.

❷ Root syntax & its generalization

Two major incarnations (I use DM):

Mainly for decomposition of content words:

Generalized root syntax (Song 2019):

❸ Examples of semigrammaticality

Ex.3: Chinese classifiers [Cl\(\lambda P \lambda x. x \in \mathrm{Atom}(P)\)]

a. bǎ 'grip' (objects with handle-like bars), běn 'volume' (bound print matter), dòng 'pillar' (buildings), miàn 'surface' (flat objects), etc.

b. wèi/míng/gè lǎoshī 'one Cl\(_{r/o/n}\) teacher'
(r = respectful, o = official, n = neutral)


Ex.4: Vietnamese negators [Neg\(\lambda t. \neg t\)]

a. không 'empty' (default), đâu 'where' (emphatic, colloquial), nào 'which' (colloquial but elevated), đếch 'fuck' (mildly vulgar), đéo 'penis, fuck' (very vulgar), etc.

b. Em không cần anh giúp.
'I Neg\(_n\) need your help.' (n = neutral)

c. Tao đéo cần mày giúp.
'I\(_v\) Neg\(_v\) need your\(_v\) help.' (v = vulgar)

❹ Monadic semantics

Point of departure: formal semantics for generative grammar (Heim & Kratzer 1998)

Def. 1 (cf. Asudeh & Giorgolo 2020):
Let \(\langle T, \eta, \mu\rangle\) be a monad on \(\mathbf{Sem}\), such that \(\forall A. TA = \langle A, \{\langle X, \surd_1\rangle, \langle Y, \surd_2\rangle, \dots \} \rangle\), where \(\langle X, \surd_1\rangle\) etc. record the root-supported types in the syntactic structure denoting \(A\). Then \(\forall f: A \rightarrow B. Tf = \lambda\langle x, Q\rangle.\langle f(x), Q\rangle\), where \(Q\) is a set of type–root pairs. The two natural transformations are \(\eta_A(x) = \langle x, \emptyset\rangle\) and \(\mu_A(\langle\langle x, P\rangle, Q\rangle) = \langle x, P\cup Q\rangle\), where \(P\) is likewise a set of type–root pairs. With \(\mu\), we can further define >>= on \(ta: TA\) and \(f: A \rightarrow TB\) as \(ta\) >>= \(f = \mu_B(Tf(ta))\).
Remark 1: The set of grammatical type–root pairs serves as a record of the root support situation in an expression. The log set is "inert" in composition and only gets "opened" at the final stage of semantic interpretation.
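Def. 1 can be tried out in a few lines of Python. This is only a sketch under my own encoding assumptions, not part of the formal proposal: a monadic value \(\langle x, Q\rangle\) is a tuple whose second slot is a frozenset of (type, root) string pairs, and the names unit, fmap, join, and bind are illustrative stand-ins for \(\eta\), \(T\), \(\mu\), and >>=.

```python
from typing import Any, Callable, FrozenSet, Tuple

# Assumed encoding: <x, Q> is (value, log), where the log is a
# frozenset of (grammatical type, root) pairs given as strings.
Log = FrozenSet[Tuple[str, str]]
M = Tuple[Any, Log]

def unit(x: Any) -> M:
    """eta: pair a value with the empty log."""
    return (x, frozenset())

def fmap(f: Callable[[Any], Any]) -> Callable[[M], M]:
    """T on morphisms: apply f to the value, leave the log untouched."""
    def lifted(m: M) -> M:
        x, q = m
        return (f(x), q)
    return lifted

def join(mm: Tuple[M, Log]) -> M:
    """mu: flatten <<x, P>, Q> into <x, P union Q>."""
    (x, p), q = mm
    return (x, p | q)

def bind(ta: M, f: Callable[[Any], M]) -> M:
    """ta >>= f, defined as mu(Tf(ta)) exactly as in Def. 1."""
    return join(fmap(f)(ta))

# The log rides along unchanged while the value is transformed.
ta = (3, frozenset({("n", "DOG")}))
print(bind(ta, lambda x: unit(x + 1)))
```

The last line shows the "inert" behaviour described in Remark 1: the value changes from 3 to 4, while the log still contains only the pair ("n", "DOG").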

We complete the monadic semantics with the ancillary function \(\mathrm{write}\) (A&G2020), which wraps a grammatical type–root pair into a dummy monadic term:

Def. 2: \(\mathrm{write}\)(X, √) = \(\langle\)1, \(\{\langle\)X, √\(\rangle\}\rangle\).

Together with >>=, this gives us a way to compose the root categorization schema:

Def. 3: ⟦[X X √ ]⟧ = \(\mathrm{write}\)(X, √) >>= \(\lambda y. \eta\)(⟦X⟧)

This writes a grammatical type–root pair into the log slot of a vacuous monadic term.
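To see why, Def. 3 can be unfolded step by step using only Defs. 1 and 2. The derivation below is a worked check rather than an additional definition; \([\![\mathrm{X}]\!]\) is the same denotation written ⟦X⟧ in the running text, and \(\gg\!=\) is >>=.

\[
\begin{aligned}
\mathrm{write}(\mathrm{X}, \surd) \gg\!= \lambda y.\,\eta([\![\mathrm{X}]\!])
&= \mu\big(T(\lambda y.\,\eta([\![\mathrm{X}]\!]))(\langle 1, \{\langle \mathrm{X}, \surd\rangle\}\rangle)\big) \\
&= \mu\big(\langle \eta([\![\mathrm{X}]\!]), \{\langle \mathrm{X}, \surd\rangle\}\rangle\big) \\
&= \mu\big(\langle\langle [\![\mathrm{X}]\!], \emptyset\rangle, \{\langle \mathrm{X}, \surd\rangle\}\rangle\big) \\
&= \langle [\![\mathrm{X}]\!], \emptyset \cup \{\langle \mathrm{X}, \surd\rangle\}\rangle \\
&= \langle [\![\mathrm{X}]\!], \{\langle \mathrm{X}, \surd\rangle\}\rangle
\end{aligned}
\]

The dummy value 1 is discarded by \(\lambda y\), and the only trace left by \(\mathrm{write}\) is the pair in the log.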

❺ Examples of monadic composition

Ex.5: the English noun dog

⟦dog⟧ = ⟦[N n √DOG]⟧ = \(\mathrm{write}\)(n, √DOG) >>= \(\lambda y. \eta\)(⟦n⟧) = \(\langle\)⟦n⟧, \(\{\langle\)n, √DOG\(\rangle\}\rangle\)
(an entity enriched by √DOG)

Ex.6: the Vietnamese negator đéo

⟦đéo⟧ = ⟦[Neg Neg √ĐÉO]⟧ = \(\mathrm{write}\)(Neg, √ĐÉO) >>= \(\lambda y. \eta\)(⟦Neg⟧) = \(\langle\)⟦Neg⟧, \(\{\langle\)Neg, √ĐÉO\(\rangle\}\rangle\)
(a boolean function enriched by √ĐÉO)
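Ex.5 and Ex.6 can be replayed in Python as a rough sketch. Several things here are my illustrative assumptions and not part of the definitions above: monadic values are (value, log) tuples, the string "[[n]]" is a placeholder for the unspecified denotation ⟦n⟧, the root label "KHÔNG" is assumed for the default negator in Ex.4, and apply_m is a hypothetical helper that applies a monadic function to a monadic argument by binding both.

```python
def unit(x):
    """eta: a value with the empty log."""
    return (x, frozenset())

def bind(ta, f):
    """ta >>= f; equivalent to mu(Tf(ta)) for this log monad."""
    x, q = ta
    y, p = f(x)
    return (y, p | q)

def write(gtype, root):
    """Def. 2: the dummy value 1 paired with a one-element log."""
    return (1, frozenset({(gtype, root)}))

def categorize(gtype, root, denotation):
    """Def. 3: write(X, root) >>= lambda y. eta([[X]])."""
    return bind(write(gtype, root), lambda _y: unit(denotation))

def apply_m(fm, xm):
    """Hypothetical helper (not in the poster): monadic application."""
    return bind(fm, lambda f: bind(xm, lambda x: unit(f(x))))

neg = lambda t: not t                    # [[Neg]] = lambda t. not t (Ex.4)

dog = categorize("n", "DOG", "[[n]]")    # Ex.5, placeholder denotation
deo = categorize("Neg", "ĐÉO", neg)      # Ex.6
khong = categorize("Neg", "KHÔNG", neg)  # assumed root label

# Same truth-conditional result, different logs.
print(apply_m(deo, unit(True)))
print(apply_m(khong, unit(True)))
```

Both negators map True to False; they differ only in the root recorded in the log, which stays untouched until the final stage of interpretation.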

See Song (2021, 2022) for larger examples.

❻ Categorical setting

The category \(\mathbf{Sem}\), being a subcategory of \(\mathbf{Set}\), is cartesian closed. We do not need extra structure, such as left/right directionality, because in Chomskyan linguistics word order is regulated not by syntax/semantics but by phonology.

Def. 1 is given from the perspective of semantics. We could also start from the syntactic side, but it is unclear whether Chomskyan syntax defines a category. Thus, focusing on the semantic side seems to be the "shortest path" for us at this stage.

Despite the name \(\mathbf{Sem}\), the category in Def. 1 is actually more like the syntax (i.e., formal calculus) category in other categorical linguistic works, since its objects/morphisms can also be viewed as types/terms. Due to its Montagovian foundation and \(\lambda\)-calculus implementation, formal semantics as it is practiced in generative grammar is still quite syntax-y from a categorical perspective.

❼ Conclusion