Wednesday, May 13, 2009

Nonsense poetry

This post at languagehat discusses attempts to use statistical analysis to determine whether an unknown script in fact represents a language. The principle seems to be that in a language, certain symbols are more likely to follow certain other symbols. For example, in English we can predict that 'q' will be followed by 'u', and certain words ('the,' 'and') will appear more frequently than others.
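To make the principle concrete, here is a minimal sketch in Python that estimates how likely each symbol is to follow another and counts word frequencies. The function name `next_symbol_probs` and the sample sentence are my own illustrative assumptions, not anything taken from the languagehat post, and a real analysis would need far more text than this.

```python
from collections import Counter, defaultdict

def next_symbol_probs(text):
    """Estimate P(next symbol | current symbol) from adjacent pairs in text."""
    pair_counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        pair_counts[current][following] += 1
    probs = {}
    for current, followers in pair_counts.items():
        total = sum(followers.values())
        probs[current] = {s: n / total for s, n in followers.items()}
    return probs

# Hypothetical sample text, chosen only for illustration.
sample = "the queen quietly asked the quick question and the quiz ended"

probs = next_symbol_probs(sample)
word_counts = Counter(sample.split())

print(probs["q"])                  # which symbols follow 'q', and how often
print(word_counts.most_common(3))  # the most frequent words in the sample
```

In this tiny sample every 'q' is followed by 'u', and 'the' is the most frequent word. A string of symbols arranged at random would show much flatter distributions.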

Several semesters ago I did a project on the semantics of nonsense poetry, and this reminded me of a couple of questions I considered at the time, although I didn't have the necessary background to delve any deeper into the issue.

Among other things I was interested in whether we could distinguish sound poetry from poetry in an unknown (or made-up, Jabberwocky-type) language based on formal characteristics. Although a sound poem may be divided into units resembling words, we have no way of determining how these words may relate to each other (syntax and morphology). All sounds are equally important. All patterning is the result of formal resemblances, not meaning imposed arbitrarily on a phonetic structure. Sound is used as material, as in painting or music. Although sound poetry may resemble a text in an unknown language in that both consist of strings of syllables which have (for the audience) no meaning, one would expect them to differ structurally, because the principles for arranging the sounds are different.

(I assumed here that music is patterned differently than language -- this claim could probably be challenged. Certainly the separation between music and language is by no means absolute, but I don't know enough about this to take it further. This also does not consider that the arrangement of words in poetry in particular is based on considerations of sound and meter at least as much as on meaning. Even so, intuitively it seems to me there must be some difference, in degree if not in kind.)

Theoretically, formal characteristics should also be sufficient to distinguish nonsense from code, where the meaning can be found by systematically applying rules to produce the text in the original language. However, there may in fact be a close resemblance between a nonsense text and something in an unknown language, which is why the discussion of statistical methods for attempting to decipher this script is interesting for my purposes.

One of the commenters links to a letter to the editor in the satirical linguistics journal Speculative Grammarian:

Carapes the ditl isch prentele whic che fiene Unincip-ikedfuls Que pland trial laing expror, no the thent acards, wal of of Eng Evis, forigh Worics on ousunt heard In youle not to linet med, mants of sen gic spers of at nam at mands wouremay.

“For efillyin froccut werepty to oreings; thicialy, sualich.” Goverphose blit.

He adds the following explanation:

This text meets the statistical specification of English at the trigram level (3-letter combinations, rather than bigrams, or 2-letter combinations), so if you use any of the statistical language identifiers out there on the web, it will usually most strongly identify as English, though clearly it is not... The interesting thing to me is that if you didn't know it was fake text, it feels vaguely like it could be a Germanic language or a not-so-Romance Romance language (like the way Romanian has been heavily influenced by contact with Slavic languages). Since Germanic + a-bit-of-Romance is a fair characterization of English, you can see why the stats are fooled.

I was intrigued by the text first, of course, because of its resemblance to nonsense texts. One frequently observed characteristic of poems such as "Jabberwocky" is that the nonsense words are all phonetically well-formed -- they look like possible words in the language they are based on. There are also traces of what could be interpreted as suffixes, function words, etc. (Perhaps there's some thinly-disguised meaning here which I'm just missing, and the text could be "translated" quite easily. I'm going to assume based on the writer's comments, however, that this was not the intent in this case...)
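The letter does not say how the fake text was produced. One common way to get text that matches a language's trigram statistics is to pick each letter at random, weighted by how often it follows the previous two letters in a real sample. The sketch below does that under that assumption. The names `build_trigram_model` and `generate`, and the training sentence, are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict

def build_trigram_model(text):
    """Count, for every pair of adjacent characters, which character follows it."""
    model = defaultdict(Counter)
    for i in range(len(text) - 2):
        model[text[i:i + 2]][text[i + 2]] += 1
    return model

def generate(model, length, seed=None):
    """Produce `length` characters whose letter triples follow the model's counts."""
    rng = random.Random(seed)
    contexts = sorted(model)          # sorted so a given seed is reproducible
    out = list(rng.choice(contexts))  # start from a random two-letter context
    while len(out) < length:
        followers = model.get("".join(out[-2:]))
        if not followers:
            # Dead end (a pair seen only at the very end of the sample):
            # restart from a random context.
            out.extend(rng.choice(contexts))
            continue
        symbols = list(followers)
        weights = list(followers.values())
        out.append(rng.choices(symbols, weights=weights)[0])
    return "".join(out)[:length]

# Hypothetical training sample; a real experiment would use a large English corpus.
training = (
    "the principle seems to be that language will show a higher probability "
    "for certain symbols to follow certain other symbols and certain words "
    "will appear more frequently than others in any longer stretch of text"
)

model = build_trigram_model(training)
fake = generate(model, 120, seed=1)
print(fake)
```

Nothing in this procedure knows about words, morphology, or syntax. Any suffixes or function words that seem to appear in the output are side effects of letter statistics alone.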

I'm not sure whether his fake text challenges or supports my theory, however. It does seem to suggest that syntax is irrelevant for this type of analysis, and part of my argument was that morpho-syntactic structures are part of what we use to interpret nonsense texts.
