Python GUI 051 – Transpo – transpo_utils.py

I’ve finished writing up the transposition cipher solvers, and now it’s time to finish off the rest of the Transpo code. The problem with documenting the utils files is that they’re almost always works in flux, meaning that if I do write about them as I’m going along, I’ll inevitably have to make changes or additions that will require revisiting those files at a later date. Well, for buttons_utils.py and transpo_utils.py, later is now.

Let’s start with transpo_utils.py. The main changes are with the data record object, and how it initializes specialized cipher types such as Sequence Transposition, plus little tweaks.

One of the tweaks was a modification of key_to_str(). Originally, this function returned digit values in the key list as numbers. This is a problem if the key width is 11 or greater, so I added an if-statement to cope with hexadecimals.

####################
# transpo_utils.py #
####################

Classes in this file:

TranspoSolverClass()

Members:

con_type .. - String name for current cipher
key ....... - Either initialized, or last key value tested
done ...... - Testing status, finished or still going
attack_type - Bruteforce, hillclimbing, wordlist, etc.
solutions . - Solutions with high ngram counts
msg ....... - The cipher message as a list
hint ...... - Provided crib text, if any
width_min . - Starting width for rectangular types
width_max . - Stopping width for rectangular types
option .... - Used for cipher types with multiple options
option2 ... - Used for Nihilist Transposition
sequence .. - Pre-calculated sequence for Sequence Transposition
hill ...... - Hillclimbing min. limit, loop count, etc.
cnt ....... - n-gram count data (count max, top offset)

Methods:

init(): .................. General initialization
init_cadenus_bruteforce(): Cadenus-specific init
init_sequence(): ......... Sequence-specific init
format_settings(): ....... Return settings as formatted string
get(): ................... Return most common settings
get_option2(): ........... Return option2 value

Functions in this file:

make_key_from_string(): Convert any key format to a list
check_len(): .......... Return the length of the cipher text
get_factors(): ........ Get factors of message length
defactor(): ........... Take pairs of factors and separate them
ravel(): .............. Encrypt message based on position data
unravel(): ............ Decrypt message based on position data
key_to_str(): ......... Convert key list to a string
suggested_widths(): ... Suggest rectangle widths based on message

....................... length
weight_widths(): ...... Suggest width for Columnar transpositions

# Imports

from crypt_utils import string_cleaner, sort_second
from math import sqrt, floor, ceil
from inc_utils import make_key

# Initialization.

# Constant used for testing a range of factors for
# columnar transpositions.

MIN_FACTOR_WIDTH = 3

# Definition for the main data messaging class.

class TranspoSolverClass():
... cipher_type = ''
... key = []
... done = False
... attack_type = 0
... solutions = []
... msg = ''
... hint = ''
... width_min = 3
... width_max = 10
... option = 0
... sequence = []
... hill = [11, 5000, 500]
... cnt = [0, 5]

# init()
#
# Object initialization, the starting values come from
# transpo_gui, which makes the call just before beginning
# the auto-solve process.

... def init(self, entries, con_type, hillsettings, countsettings):
....... msg_raw = entries['cipher_area'].get(1.0, 'end').strip()
....... hint_raw = entries['hint'].get().upper()

....... self.con_type = con_type
....... self.msg = string_cleaner.clean(msg_raw)
....... self.hint = string_cleaner.clean(hint_raw)
....... self.width_min = entries['min_val'].get()
....... self.width_max = entries['max_val'].get()
....... self.option = entries['option'].get()
....... self.option2 = entries['option2'].get()
....... self.hill = hillsettings
....... self.attack_type = 0
....... self.cnt = [countsettings[0], countsettings[1]]
....... self.ret_msg = ''

....... self.done = False
....... self.solutions = []

# Route, Fixed, Myszkowski and Cadenus don't use the
# standard keys with non-repeating digits. For everything
# else, initialize the key as ascending digits for the
# given minimum width of the key.

....... if self.con_type not in 'ROUTE FIXED MYSZKOWSKI CADENUS':
........... self.key = make_key(self.width_min)

# For Cadenus and Myszkowski, if we're doing a wordlist attack,
# just set the key to 0.

....... elif self.con_type in 'CADENUS MYSZKOWSKI':
........... self.key = 0

# For Complete and Incomplete Columnar, Nihilist and AMSCO,
# check the hillclimber default settings. If the test key
# width equals or exceeds the minimum hillclimber limit,
# set the key to 0 (for loop counting), turn the solutions
# var to a dict object, and set the attack_type to 1.

....... if self.con_type in 'INCOMPLETE NIHILIST AMSCO':
........... if self.width_min >= self.hill[0]:
............... self.solutions = {}
............... self.key = 0
............... self.attack_type = 1

# init_cadenus_bruteforce()
#
# Cadenus-specific initialization for bruteforce attacks
# on the key (running from [0,0,0,0] to [24,24,24,24]).
# countsettings sets the minimum starting n-gram count, plus

# a “leeway” offset for “top m” counts.

... def init_cadenus_bruteforce(self, entries, countsettings):
....... self.attack_type = 1
....... self.cnt = [countsettings[0], countsettings[1]]
....... self.ret_msg = ''

....... self.done = False
....... self.solutions = []

....... if len(entries['key'].get().strip()) == 0:

# GUI key field is blank. Make the key all 0s.

........... self.key = [0] * self.width_min
....... else:

# Use whatever's in the key field as a starting point
# to pick up where we left off, if desired.

........... self.key = []
........... for ch in entries['key'].get().upper():
............... self.key.append('AZYXVUTSRQPONMLKJIHGFEDCB'.find(ch))

# init_sequence()
#
# Sequence Transposition-specific initialization.
# Clean up the cipher text, extract the primer and check digit,
# and turn the primer into the sequence for the message length.

... def init_sequence(self, msg):
....... self.ret_msg = ''
....... msg_short = msg.replace('.', '').replace(' ', \

................................................ '').replace('\n', '')
....... primer = msg_short[0:5]
....... check = int(msg_short[len(msg_short) - 1])

# Keep just the cipher text portion of the message.

....... msg_short = msg_short[5: len(msg_short) - 1]

# Make the sequence.

....... seq = []
....... for p in primer:
........... seq.append(int(p))
....... ptr = 0
....... while len(seq) < len(msg_short):
........... seq.append((seq[ptr] + seq[ptr+1]) % 10)
........... ptr += 1

# Error check for check digit equals last sequence digit.
# If yes, save sequence to self.sequence member.

....... if seq[len(seq) - 1] != check:
........... self.ret_msg = "Checksum Error: %s doesn't match %s" \

............................ % (check, seq[len(seq) - 1])
....... else:
........... self.sequence = seq
........... print('sequence: %s' % self.sequence)

# format_settings()
#
# Put the object values into a format for pretty
# printing, used for testing.

... def format_settings(self):
....... ret = []
....... ret.append('Con Type: %s' % self.con_type)
....... ret.append('Key: %s' % self.key)
....... ret.append('Attack type: %s' % self.attack_type)
....... ret.append('Hint: %s' % self.hint)
....... ret.append('Option: %s' % self.option)
....... ret.append('Option2: %s' % self.option2)
....... ret.append('Hill: %s' % self.hill)
....... ret.append('Done: %s' % self.done)
....... ret.append('Width Min: %s' % self.width_min)
....... ret.append('Width Max: %s' % self.width_max)
....... ret.append('Cnt settings: %s' % self.cnt)
....... ret.append('Return Msg: %s' % self.ret_msg)
....... ret.append('Message: %s' % self.msg)
....... ret.append('Sequence: %s' % self.sequence)
....... ret.append('Solutions: %s' % self.solutions)

....... return '\n'.join(ret)

# get()
#
# Return the variable data to the auto-solver for
# initialization prior to doing the solve run for
# that time slice.

... def get(self):
....... return self.msg, self.key, self.hint, \

...............self.attack_type, self.option, \
.............. self.width_min, self.width_max, self.cnt

# get_option2()
#
# Imitating get(), just for returning option2, which is
# used only for Nihilist Transposition.

... def get_option2(self):
....... return self.option2

"""
Finally done with the TranspoSolverClass definitions.
Create the data record object as a global.
"""

transpo_solver_data = TranspoSolverClass()

"""
make_key_from_string()

This turned into kind of an all-purpose key reformatting operation. The user can enter a key in the GUI in one of a few different forms:
12345 - Simple digits, no spaces. Allows for 0-9.
1 2 3 - Simple digits, w/ spaces. Allows for 0-infinity.
ABCDE - Simple letters. Allows for a keyword for conversion.

In all cases, take the key and put it in a list of unique integers.
"""

def make_key_from_string(s):

... if len(s.strip()) == 0:
....... return ''

# Test if the key contains spaces.

... ret = []
... offset = 0
... temp = s.split(' ')

# If it does, assume that it's numeric. If there's
# no 0, assume that it starts numbering at 1, set
# the offset to 1, and just append the values in the
# key to a list as ints.

... if len(temp) > 1:
....... if '0' not in s: offset = 1
....... for e in temp:
........... ret.append(int(e) - offset)
....... return ret

# We have a key that's not space-delimited. Test if it's
# alphabetic.

... elif s[0].isalpha():
....... ret = [0] * len(s)
....... cntr = 0

# It is. Remove any characters that aren't in the legal
# list of letters for the current language, and build up
# the key, assigning unique values to repeating letters.

....... for a in string_cleaner.letters:
........... if a in s:
............... for ptr in range(len(s)):
................... if a == s[ptr]:
....................... ret[ptr] = cntr
....................... cntr += 1
....... return ret

# We have a numeric key. Make sure numbering starts at
# 0 (as above) and convert the key characters to ints.

... else:
....... if '0' not in s: offset = 1
....... for e in s:
........... ret.append(int(e) - offset)
....... return ret

"""
check_len()

Some of the cipher types, like Cadenus and Nihilist, have to be particular lengths. For Cadenus, it's a factor of 25, and some of the others need to be square or rectangular. The GUI will have a "Check Length" button which will display the current plaintext length in the Message field at the top of the screen.
"""

def check_len(entries):
... msg = string_cleaner.clean(entries['in_area'].get(1.0, 'end'))

... entries['message'].set('Message length: %s' % (len(msg)))

"""
get_factors()

A couple cipher types, like Complete Columnar, need to be complete rectangles. For auto-solving, when the user is entering the starting and stopping key widths, determine the factors of the message length in advance and return them as matched pairs for display in the Message window. That is, if the message length is 36, return the pairs as [3, 12], [4, 9], [6, 6]. Start at the value in MIN_FACTOR_WIDTH, and only display as "width smaller than height."
"""

def get_factors(lenval):

... results = []
... sr = int(sqrt(lenval)) + 1
... for n in range(MIN_FACTOR_WIDTH, sr):
....... if lenval % n == 0:
........... results.append([n, int(lenval/n)])

... return results

"""
defactor()

As mentioned above, get_factors() returns the factors of the message length as matched pairs. If we need to use the pairs for calculating the best widths for Columnar types, return them as a flat list.
"""

def defactor(factor_pairs):
... ret = []
... for pair in factor_pairs:
....... for e in pair:
........... ret.append(e)

... return ret

"""
ravel()

As described in the transpo introduction, transposition ciphers encrypt the positions of the plaintext message. Therefore, if we number the positions and perform the encryption step, both encryption and decryption become virtually identical. This is a process I call "raveling" and "unraveling."

ravel() - Take the message characters and put them in encrypted position order. Return as an enciphered string.
unravel() - Take the encrypted position order and put the cipher message characters back into their original order. Return as a plaintext string.
"""

def ravel(msg, order):
... ret = [''] * len(msg)

... for i in range(len(msg)):

....... ret[order.index(i)] = msg[i]

... return ''.join(ret)

"""
unravel()

Take the encrypted position order and put the cipher message characters back into their original order. Return as a plaintext string.
"""

def unravel(msg, order):
... ret = [''] * len(msg)

... for i in range(len(msg)):

....... ret[i] = msg[order.index(i)]

... return ''.join(ret)

"""
key_to_str()

Take a numeric list and return it as a string.
Unless the caller specifies a starting value of 0, offset all of the digits by 1 as a default.
If a value exceeds 9, add it to 55 to get hex values running from 'A' to 'F'.
"""

def key_to_str(key, offset=1):
... ret = []
... for i in key:
....... x = i + offset

....... if x < 10:
# Treat as a normal integer.
........... ret.append(str(x))

....... else:
# Treat as hex.
........... ret.append(chr(x + 55))
... return ''.join(ret)

"""
suggested_widths()

The American Cryptogram Association (ACA) has suggested widths for each of their transposition cipher types. I'm using these suggestions along with the message length to tell the user what the more probable minimum and maximum test key widths should be.
"""

def suggested_widths(con_name, msg, msglen):

# amin, amax: ACA suggested values
# skip_calc: Skip the calculation if there can only be 1 value
# retmsg: Return the results as a string.

... amin, amax = -1, -1
... skip_calc = True
... retmsg = ''
... con_list = con_name.split(' ')
... con_type = con_list[0]

... if con_type == 'AMSCO':
....... amin, amax, skip_calc = 8, 12, False

# Cadenus length needs to be a factor of 25.

... elif con_type == 'CADENUS':
....... if msglen % 25 == 0:
........... amin, amax = int(msglen/25), int(msglen/25)
....... else:
........... retmsg = \

..............'Message length (%s) not a factor of 25, between 4-6' \
..............% (msglen)

# For Complete and Incomplete Columnar, use the statistical
# distribution of vowels per row to suggest the best
# starting width to test with.

... elif con_type == 'COMPLETE':

# Start as normal.

....... amin, amax = floor(msglen / 15), ceil(msglen / 8)

# Get the suggested widths in descending order of probability

....... weighted_order = weight_widths(msg, msglen, amin, amax)

# Get the factors of a complete rectangle

....... factors = defactor(get_factors(msglen))
....... retmsg = ['Weighted factors of %s:' % (msglen)]

# Use only those widths in the ACA suggested range.

....... for e in weighted_order:
........... if e in factors:
............... retmsg.append(str(e))

....... retmsg = ' '.join(retmsg)

... elif con_type == 'INCOMPLETE':

....... amin, amax = floor(msglen / 18), ceil(msglen / 15)

....... retmsg = []
....... weighted_order = weight_widths(msg, msglen, amin, amax)
....... for e in weighted_order:
........... retmsg.append(str(e))
....... retmsg = 'Weighted order of probability: ' + \

................ (', '.join(retmsg))

... elif con_type == 'MYSZKOWSKI':
....... amin, amax, skip_calc = 12, 15, False

# Nihilist needs to be in a square.

... elif con_type == 'NIHILIST':
....... if sqrt(msglen) != int(sqrt(msglen)):
........... retmsg = 'Message length (%s) not a square' % (msglen)
....... else:
........... amin, amax = int(sqrt(msglen)), int(sqrt(msglen))
........... if amin > 10:
.............. retmsg = \

................'WARNING: ACA guidelines say a max of 10x10 (is %s)' \
............... % (amin)

# Railfence and Redefence both use a range between 3 and 7 rails.

... elif con_type == 'RAILFENCE':
....... amin, amax = 3, 7

... elif con_type == 'REDEFENCE':
....... amin, amax = 3, 7

# The Sequence key has to be 10 letters long.

... elif con_type == 'SEQUENCE':
....... amin, amax = 10, 10

# Route needs to be a complete rectangle.

... elif con_type == 'ROUTE':
....... factors = defactor(get_factors(msglen))
....... f = []
....... for e in factors:
........... f.append(str(e))
....... retmsg = 'Weighted factors of %s: %s' % (msglen, ', '.join(f))

# If skip_calc is False, calculate the suggested key
# widths from amax and amin.

... if not skip_calc:
....... amin, amax = floor(msglen / amax), ceil(msglen / amin)

... return amin, amax, retmsg

"""
weight_widths()

Calculate the most probable key width for Columnar transpositions based on the distribution of vowels in each row of the rectangle, given that AEIOUY represents 40% of the message text, on average.
As given by William Friedman in his MILCRYPT textbooks.
"""

def weight_widths(msg, msglen, testmin, testmax):

... devs_list = []

# Test each width in the range.

... for width in range(testmin, testmax + 1):

# Calculate the size of a complete rectangle, plus
# the residue.

....... lengthnorm = int(msglen/width)
....... leftover = msglen - lengthnorm * width

....... offsets = [lengthnorm] * width
....... for i in range(leftover):
........... offsets[i] += 1

# Build the rectangle and populate it with the message.

....... box = []
....... ptr = 0
....... for i in range(width):
........... box.append(msg[ptr: ptr + offsets[i]])
........... ptr += offsets[i]

# Set up the deviation calculations.

....... devs = []
....... expected = width * 0.4
....... total_dev = 0

# Go through each line and get the deviation of counted
# number of vowels from the expected.

....... for i in range(lengthnorm):
........... vowelcnt = 0
........... s = ''
........... for j in range(width):
............... ch = box[j][i]
............... s += ch
............... if ch in 'AEIOUY':
................... vowelcnt += 1

# Add to the total deviations for the rectangle.

........... total_dev += abs(expected - vowelcnt)
........... devs.append(abs(expected - vowelcnt))

# Calculate the statistics for the current width.

....... sumdevs = sum(devs)
....... avgsumdevs = sumdevs/lengthnorm
....... sqravgdevs = avgsumdevs ** 2
....... sqrdevs = expected/sqravgdevs

# Add to the list of deviation per width.

....... devs_list.append([width, sqrdevs])

# Descending sort on size of squared deviation.

... devs_list.sort(key=sort_second, reverse=True)
... ret = []

# Just return the widths, with best suggestion
# at the top.

... for elem in devs_list:
....... ret.append(elem[0])

... return ret

Next up: button_utils.py.

Published by The Chief

Who wants to know?

Leave a comment

Design a site like this with WordPress.com
Get started