This talk consists of two parts. In the first part, I will present a walk-through of Gabmap, a popular web-based application for the automated analysis and visualization of dialect data, following a typical example study step by step. A dialectometric study typically uses aggregated information from many items, or features, to investigate dialectal differences. In the second part, I will present a relatively understudied problem in the automated analysis of dialects: finding the characteristic features of a given dialect.
In particular, we will focus on finding words (that we call ‘shibboleths’) whose pronunciations in a given dialect group or area are consistent within that group yet distinctive with respect to the other dialects in the area of interest.
Jeroen van Craenenbroeck:
This talk is situated at the intersection of quantitative and qualitative linguistics. It uses quantitative-statistical methods to further our theoretical understanding of variation in verb cluster ordering in Dutch dialects. In so doing, it harnesses and combines the strengths of both approaches: quantitative linguistics has sophisticated means of dealing with large and highly varied data sets, while hypotheses and analyses from qualitative linguistics can be used to guide and narrow down the interpretation of the statistical results. In the case of verb clusters, I show how the massive amount of variation manifested in the raw data can be largely whittled down to the interaction between three grammatical parameters. The method thus offers a way to separate the signal (i.e. that part of the variation that is due to grammar proper) from the noise (all extra-grammatical factors, ranging from sociolinguistic variation all the way to simple speech errors). The talk fits into a broader research program, the goal of which is to narrow the gap between these two approaches to linguistics.