Does That Make it Scolfish?

Saturday, March 23rd, 2019

I’ve been messing around with predictive text and Markov chains for my campaign, to come up with a quick name list reference to use with one-off NPCs. The players decided, in a fit of whimsy, that the king of the elves was named Scott. Scott Eddington. Fine, I says to myself, the elves are Scottish. Ach. We have Scottish elves.

However, we’d already thrown around a bunch of elf-ish sounding names for other elves and locations in their kingdom, so I couldn’t just borrow straight from a Scottish name list. Hence my playing with language dissection tools — I needed to make the choices work with each other.

A couple decades ago, I used to be really interested in designing conlangs — constructed languages — and still had a few tools sitting on my drive, and some remaining memory of how languages are functionally constructed. This was something I’d done multiple times in the past for fun, as well as for writing fiction and use in fantasy gaming; for example, taking apart Sumerian to populate a pseudo-Sumerian lexicon, or combining Finnish with Celtic to produce a new language.

I didn’t need to dig quite that deep into my dusty toolset, though, or give myself a refresher course on the subject. I just needed a simple list of names that would sound right, and I already had the tools to make one.

Or so I thought.

I discovered a number of those tools no longer function on a 64-bit OS, and their programmers have long since disappeared into the mists of the internet, so getting an update was no longer feasible. Nor was installing a virtual machine running Windows XP just to do this.

So I went with what I had: TextPad, Excel, and the Internet.

The first thing I needed was lists of names: Scottish and Elvish. Both were reasonably easy to find. Neither, however, was available as a plaintext list, being tied to meanings, and gender, and historical information, and so forth. With enough knowledge of macros and RegEx, or skill at Googling, you can use Excel and TextPad to quickly strip the extraneous information from a block of text and pare it down to just the pieces you want. (Though some lists are easier to pare down than others.)
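For anyone who would rather script that stripping step than wrestle with TextPad, here is a minimal sketch in Python. The raw line format (name, gender marker, dash, meaning) is purely an assumption for illustration; real lists will each need their own pattern.

```python
import re

# Hypothetical raw lines; this layout is an assumption, not a real source list.
RAW = """\
Alasdair (m) - defender of men; Gaelic form of Alexander
Catriona (f) - pure
Eilidh (f) - sun, radiant one
"""

# Capture the leading run of letters on each line, allowing internal
# apostrophes or hyphens, and ignore everything after it.
NAME_AT_START = re.compile(
    r"^[ \t]*([^\W\d_]+(?:['’-][^\W\d_]+)*)", re.MULTILINE
)

def extract_names(text):
    """Return just the names, one per input line that starts with one."""
    return NAME_AT_START.findall(text)

names = extract_names(RAW)
print(names)  # ['Alasdair', 'Catriona', 'Eilidh']
```

The same idea works as a find-and-replace in any editor with regular expressions: match the name at the start of the line and throw the rest away.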

That ease of availability also gave me choices. There were long lists of modern Scottish names, which I suspected were just “names people in Scotland are giving their babies right now”, rather than a list of traditionally Scottish (and Scottish-sounding) names — which is what I really wanted — as well as a decent-sized list of Scottish-Gaelic names. The latter was something of a jackpot, as these already sound vaguely similar to a Scots-influenced elvish. (They are also, however, insufferably unreadable and unpronounceable to Americans unless carefully transliterated into modern English.)

This did, however, give me something to shoot for in terms of sound.

I eventually went with a shorter list of modernized, traditional Scottish names.

However, the first list of Elvish names I used ended up not being elvish-sounding enough. There were tons of names on it, but only a handful had the precise Tolkienian feel I wanted for the next step. So I created an Excel spreadsheet that randomly combined elvish prefixes and suffixes with core elven words to produce what I hoped might work as elvish-sounding names.
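The spreadsheet logic is simple enough to sketch in a few lines of Python. The fragments below are placeholders invented for illustration, not the contents of my actual spreadsheet, and the fixed prefix + core + suffix pattern is an assumption about the simplest way to do it.

```python
import random

# Placeholder fragments, made up for this example.
PREFIXES = ["Ael", "Gal", "Fin", "Thran"]
CORES = ["ad", "ith", "or", "en"]
SUFFIXES = ["riel", "dor", "las", "wen"]

def random_name(rng):
    """Glue one random prefix, core, and suffix together."""
    return rng.choice(PREFIXES) + rng.choice(CORES) + rng.choice(SUFFIXES)

rng = random.Random(2019)  # fixed seed so a run can be repeated
candidates = [random_name(rng) for _ in range(10)]
print(candidates)
```

Every fragment is equally likely and nothing checks how the pieces join, so the output of something like this needs curating by hand.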

The results were hit-or-miss due to the simplistic randomization algorithm and the content of the core word list. While I poked around for a shorter list of elvish names taken directly from Tolkien’s work, everything I found would have taken far more time to pare down than I had already spent. I ended up using a quickly curated selection from the randomized combinations.

Note: I could have gone deeper into this project and determined the common consonant, vowel, and CV clusters for each language, then written a script to randomly combine those clusters in the most common patterns and lengths for the desired child language. I almost did, as I have (or rather, had) a couple tools specifically made to do so — the one I had written myself is surprisingly one of the programs that still functions on a modern OS — but doing so would also have proved much more time-intensive than I desired for this particular project. A Markov chain generator could fake this closely enough that I didn’t mind skipping this step.
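To give a rough idea of what that cluster analysis involves, here is a small Python sketch that counts vowel and consonant clusters in a list of names. It is not the tool I wrote back then. It only knows the plain ASCII vowels (with y counted as a vowel, which is an assumption), so accented Gaelic vowels would be miscounted as consonants.

```python
import re
from collections import Counter

VOWELS = "aeiouy"  # assumption: y is a vowel; accented vowels are not handled

# A cluster is an unbroken run of vowels or an unbroken run of other letters.
CLUSTER = re.compile(rf"[{VOWELS}]+|[^{VOWELS}\W\d_]+")

def cluster_counts(names):
    """Return (vowel_clusters, consonant_clusters) as two Counters."""
    vowel_runs, consonant_runs = Counter(), Counter()
    for name in names:
        for run in CLUSTER.findall(name.lower()):
            target = vowel_runs if run[0] in VOWELS else consonant_runs
            target[run] += 1
    return vowel_runs, consonant_runs

vowel_runs, consonant_runs = cluster_counts(["Galadriel", "Elrond", "Hamish"])
print(vowel_runs.most_common(3))
print(consonant_runs.most_common(3))
```

A generator built on top of this would then pick clusters in proportion to those counts and string them together in the patterns the parent languages favor.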

Armed with my two lists of names, it was time to mush them together. I put both lists into one column, one name per line, then used Excel to randomly mix them[1] (easily achieved with Excel’s RAND function and sorting). I pasted the now-randomized list into TextPad and used RegEx to turn the column into a block of text (a “corpus”), since the Markov generator I used required its input as a block. The generator then parsed the corpus to discover the most common letter combinations, prefixes, and suffixes, and combined those into new words following the rules it deduced from the corpus’ content.
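For the curious, the whole mix-and-generate step can be approximated in Python. This is a generic letter-level Markov chain, not the code of the generator I actually used, whose internals I can only guess at. The name lists are placeholders, and the choice of looking back two letters (`order=2`) and capping names at twenty letters are both assumptions.

```python
import random
from collections import defaultdict

# Placeholder parent lists for illustration only.
SCOTTISH = ["Alasdair", "Callum", "Fiona", "Hamish", "Isla", "Morag"]
ELVISH = ["Arwen", "Celeborn", "Elrond", "Galadriel", "Legolas", "Thranduil"]

def make_corpus(rng, *name_lists):
    """Shuffle all names together and join them into one block of text."""
    names = [name for group in name_lists for name in group]
    rng.shuffle(names)
    return " ".join(names)

def build_model(corpus, order=2):
    """Map each `order`-letter context to the letters seen right after it."""
    model = defaultdict(list)
    for name in corpus.split():
        # "^" pads the start of a name and "$" marks its end.
        padded = "^" * order + name.lower() + "$"
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate(model, rng, order=2, max_len=20):
    """Walk the chain from the start context until "$" or max_len letters."""
    context, letters = "^" * order, []
    while len(letters) < max_len:
        nxt = rng.choice(model[context])
        if nxt == "$":
            break
        letters.append(nxt)
        context = context[1:] + nxt
    return "".join(letters).capitalize()

rng = random.Random(23)
corpus = make_corpus(rng, SCOTTISH, ELVISH)
model = build_model(corpus)
print([generate(model, rng) for _ in range(8)])
```

Because this sketch splits the corpus back into individual names before counting, the shuffle does not change the letter statistics it learns. A generator that reads across the spaces between names could behave differently.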

Finally, I checked the result.

The first few tries weren’t quite correct.

There were a significant number of two-letter names and excessively long names, as well as too many names that simply didn’t sound right or were completely unpronounceable. I had to play a bit with the content of the corpus and the rules the generator was using to parse it until the lists it produced began to look and sound right: results that were neither too oddly random nor just reproductions of names in the parent lists, both of which are possibilities if you don’t have your Markov rule set just right.
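A different way to attack the same problems is to leave the generator alone and filter its output afterward. That is not what I did, but as a sketch in Python it might look like the following. The length limits and the "four consonants in a row is unpronounceable" rule are arbitrary assumptions, and the vowel test only knows plain ASCII vowels.

```python
import re

# Assumption: a run of four or more non-vowel letters is unpronounceable.
TOO_MANY_CONSONANTS = re.compile(r"[^aeiouy\W\d_]{4,}")

def filter_names(candidates, parents, min_len=3, max_len=12):
    """Drop names that are too short, too long, copied, repeated, or clunky."""
    seen = {p.lower() for p in parents}
    kept = []
    for name in candidates:
        low = name.lower()
        if not (min_len <= len(low) <= max_len):
            continue  # two-letter names and the very long ones
        if low in seen:
            continue  # a straight copy of a parent name, or a repeat
        if TOO_MANY_CONSONANTS.search(low):
            continue  # probably unpronounceable
        seen.add(low)
        kept.append(name)
    return kept

generated = ["Al", "Scott", "Galdrith", "Xkrrth", "Galdrith",
             "Aelinduilarionethwen"]
print(filter_names(generated, parents=["Scott"]))  # ['Galdrith']
```

Raise `max_len` if you want to keep the occasional eye-crossing place name.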

In the end, I threw a handful of Scottish-Gaelic names into the corpus, which helped produce better-sounding (and looking!) combinations.

While the generator still produced the odd two-letter name, as well as a couple really long names (which is fine, because Scottish and Elvish do that at times, and it will make my players’ eyes cross to deal with a twenty-letter place name), I had a decent-sized list of “Scottish-Elvish names” to draw on during the game, for those times when the players decide they really want to know the name of Random City Guard or NPC Farmer, or Unremarkable Village, or even the answer to “What Do Elves Call Squirrels?”


[1] Thinking about it later, it is possible I incorrectly presumed the randomized combination would help the generator mix the two languages together more easily — increasing the likelihood it would borrow from both when producing new names — so this step may have been unnecessary.


4 Responses to “Does That Make it Scolfish?”

  1. BismuthBorealis says:

    Stumbled upon this blog recently, via the wuxia-name-gen which is my jam and just what I (still and have always) needed, but then digging around a little found the actual blog part not just the wuxia part.

    This too, it turns out, is my jam. I am slightly curious though, about what you used for the markov-ing; I’ve used the Donjon markov gen for names in the past myself, but that’s got the limitations of it being a website and not able for me to fiddle with, and probably hardware/software/internet limits about length or somesuch.
    I also discovered a python script *somewhere* which is slightly nicer to me for limits and such, but it being something I’d not written myself wasn’t the best when it came to getting it to bend to my whims, plus it was one for sentences rather than words.

    Um. Suddenly not sure what else to cram into this comment.
    Ah well. this is a comment and it comments so it’s fine.

  2. greyorm says:

    Sorry, haven’t checked or updated the site in a while! (Been busy actually running my games, and dealing with life.) For this particular bout of name generation, I used a website that allowed me to upload text files filled with the elements I wanted to use as my base…but it appears I have to dig it out of my bookmarks on my other browser as it isn’t popping up for me with a quick Google search. (Sorry, I’ll try to get to that in short order and put it in another comment.)

    Glad you liked the wuxia generator! I agree, it’s awesome (but I would).

  3. greyorm says:

    Found it. I used Chaney, which can be located here: https://chaney.herokuapp.com/

  4. BismuthBorealis says:

    Haha, nothing wrong with inactivity, for example despite commenting a month ago this is actually the first time I looked to see if there was a reply.

    (And I was both glad and amused by my timing in such)

    Thanks for the link, though! I shall maybe (hopefully?) get some use out of it!

