
Markov Chains
I thought that it would be a fun idea to play around with Markov chains a bit. If you don't know what a Markov chain is, don't worry - I don't really know either - but here's my vague understanding of it. As far as I know, a Markov chain text generator is a type of program that can look through a database of sentences and generate its own sentence from them. It keeps track of each unique word in the database and notes the words that come after it (for example, "it" could be part of "it is" or "it was not" or "it could be"), working out how likely each word is to follow another. When it generates a sentence, it picks a random first word and then chooses each following word based on how likely it is to come after the one before it.
There are a bunch of variables to tweak. For example, you can change how much the chain prefers to pick the most likely following word: at the maximum setting you'll get pretty much the same sentence every time, and at the minimum you'll get a bunch of random words in a row. You can also change the chain's 'memory' - that is, how many previous words it considers when choosing the next one.
I definitely don't understand the mechanics enough to make one from scratch, but luckily Python has a library called markovify that does most of the heavy lifting so I can play around with the fun parts.
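For the curious, here's a minimal sketch of the idea described above in plain Python. It is not how markovify works internally, just a toy version under a few assumptions: the names build_chain and generate are made up for illustration, the first word is drawn from words that actually start a sentence (rather than from every word in the database), and state_size plays the role of the 'memory' setting.

```python
import random
from collections import defaultdict


def build_chain(sentences, state_size=1):
    """Map each run of `state_size` words to the list of words seen after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        # None markers let the chain learn how sentences begin and end.
        padded = [None] * state_size + words + [None]
        for i in range(len(padded) - state_size):
            state = tuple(padded[i:i + state_size])
            chain[state].append(padded[i + state_size])
    return chain


def generate(chain, state_size=1, rng=random, max_words=50):
    """Walk the chain from the start marker until an end marker comes up."""
    state = (None,) * state_size
    out = []
    while len(out) < max_words:
        # Followers are stored with repeats, so common ones get picked more often.
        nxt = rng.choice(chain[state])
        if nxt is None:
            break
        out.append(nxt)
        state = state[1:] + (nxt,)
    return " ".join(out)
```

Calling build_chain(["it is", "it was not", "it could be"]) and then generate on the result walks from "it" to one of "is", "was" or "could" and keeps going until it hits the end of a sentence. Raising state_size makes the output stick closer to the source text, since each choice then depends on more of the preceding words.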
You need to supply a database before the chain can do anything, but any large collection of text will do. I tried a couple of different options and each one had a slightly different flavor of results, but my favorite ended up being the original Star Wars script. My first experiments used the chain model with all the default settings, which produced some funny results, but I wanted to go weirder.
By using some overcomplicated text-swapping code and playing around with markovify's internal methods, I got the program to treat letters as "words" and words as "sentences" - letting it generate text letter-by-letter and invent new words in the process. Since it's still looking for the more likely next options, and consonants are generally followed by vowels and vice versa, a surprising amount of its gibberish is still pronounceable. Sometimes it generates enough real words in a row to make a feasible sentence, and sometimes it almost gets there but throws in a new creation like "attacularge" or "wherweighters."
I thought it would be interesting to try getting it to make up new quotes, so I used a bit of code to format the script, adding colons to the dialogue prompts to better distinguish them. By re-rolling any sentence without a colon, I was able to get quotes consistently, which led to some pretty fun results. (The quotes were often attributed to actual characters, but there was the occasional new creation like "DEATH STARKIN" or "BEAM POV PORKIN.")
I'll leave you with a couple of good sentences from each of the three iterations:
Iteration 1 (words):
Now that you've been around those giant starships you're beginning to like her.
The fearsome Dark Knight takes a defensive position.
Luke smiles and scratches his head from right to left.
Iteration 2 (letters):
Threep fright
Yeah, not reparagmenside office.
Gold Leia runs explosing, is not wasteland.
Stat mom on't wonne com oun to beento fleve way a grime!
Iteration 3 (letters - quotes only):
LUKE: All hoots this ships, a Rebel cockpit.
LEIA: Luke aided space.
SPACE AROUND THREEPIO: proaches.
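If you want to try something like the letter-by-letter and quotes-only versions yourself, here's a rough sketch. It is not the text-swapping code or the markovify internals mentioned above, just a simplified stand-in built on assumptions: every character (spaces included) counts as one step in the chain, the function names are made up for illustration, and a line counts as a quote if it has a colon anywhere in it.

```python
import random
from collections import defaultdict


def build_char_chain(lines, state_size=3):
    """Map each run of `state_size` characters to the characters seen after it."""
    chain = defaultdict(list)
    for line in lines:
        # None markers record where lines begin and end.
        padded = [None] * state_size + list(line) + [None]
        for i in range(len(padded) - state_size):
            state = tuple(padded[i:i + state_size])
            chain[state].append(padded[i + state_size])
    return chain


def make_line(chain, state_size=3, rng=random, max_chars=200):
    """Generate one line character by character."""
    state = (None,) * state_size
    out = []
    while len(out) < max_chars:
        nxt = rng.choice(chain[state])
        if nxt is None:
            break
        out.append(nxt)
        state = state[1:] + (nxt,)
    return "".join(out)


def make_quote(chain, state_size=3, rng=random, tries=100):
    """Re-roll until a generated line contains a colon; give up after `tries`."""
    for _ in range(tries):
        line = make_line(chain, state_size, rng)
        if ":" in line:
            return line
    return None
```

Feed build_char_chain a list of script lines with the speaker names already formatted as "NAME: dialogue", and make_quote will keep re-rolling until it lands on something that looks like a line of dialogue. A smaller state_size gives wilder invented words, and a larger one mostly copies the source.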