# Anatomy of a DOOM Level: Part 1 - WAD File Structure

Let's take a deep dive into the structure, anatomy and loading of a DOOM WAD file and its lump data.

DOOM levels are commonly known as WAD files. In fact, a single WAD file contains all of the levels and data for an entire classic DOOM or DOOM II campaign, linked together in the form of episodes. The DOOM engine and the WAD file format were also licensed by other studios, such as Raven Software for Hexen and Rogue Entertainment for Strife.

WAD stands for "Where's All the Data?", a name coined by the DOOM creators back in the early 90s. I won't spend much time on the history of DOOM, as that is covered in better substance elsewhere. This page focuses on DOOM level building in two major parts:

1. The format and structure of a WAD file
2. Creating a DOOM level using a modern creation tool

## Helpful References

The best resources I have found for understanding the format of a WAD file are not so much the wikis and documentation found online, but the original documentation from that era and the source code of the DOOM engine itself.

I found that the [DIY Doom](https://github.com/amroibrahim/DIYDoom) project and its reverse engineering of the DOOM engine shed a lot of light on the WAD structure. For a deep dive I suggest you go there. It was the basis for my own recreation of the [DOOM engine in Python](https://github.com/jordansavant/doomengine.python) as a pet project. I will reference this as "DIY".

Additionally, [The Unofficial DOOM Specs](/images/Unofficial_DOOM_Specs.txt) file, authored by Matthew S. Fell in 1994 and written as a classic text document, provides the most comprehensive detail I can find regarding the structure of a WAD file. I will reference this as "UDS".

## Types of WAD Files

There are two main types of WAD files: an IWAD and a PWAD. IWAD stands for "internal WAD" and contains all of the main game data.
A PWAD stands for "patch WAD" and can contain data that overwrites the IWAD content, much like a mod or patch. Somewhat confusingly, both kinds of WAD file traditionally end in the `.WAD` extension, such as `DOOM.WAD` as an IWAD and `HEXADEATHIMAL.WAD` as a PWAD.

In the second part of this page we will create a PWAD level that overwrites the `MAP01` level of the main DOOM II IWAD.

## WAD Structure

A WAD file is divided up into "lumps", each containing contiguous information about a part of the game or level. The primary IWAD has the following lumps:

```txt
LumpName  Description
--------  -----------
PLAYPAL   fourteen 256 color palettes
COLORMAP  maps colors in the palette down to darker ones
ENDOOM    text message displayed when you exit to DOS
DEMOx     x=1-3, are the demos
ExMy      subsequent entries define a single level's data
MAPxy     like ExMy, but for DOOM 2
TEXTURE1  list of wall texture names and their composition data,
          used in the SIDEDEF portion of each level
TEXTURE2  more wall texture compositions
PNAMES    lists all lump names used as wall patches
GENMIDI   General Midi standard instrument data
DMXGUS    Gravis Ultra Sound instrument patches
D_ExMy    music for a doom 1 level
D_INTER   music played on the summary screen between levels
D_INTRO   music played when the game starts
D_INTROA  more introductory music
D_VICTOR  music played on the victory text-screen after an episode
D_BUNNY   music for while a certain rabbit has his story told
D_*       music for a doom 2 level
...       following lumps contain menu data, strings and pictures
```
_excerpt from UDS_

The custom level we will be building mostly concerns itself with the `ExMy` and `MAPxy` sections, since it overwrites MAP01 of DOOM II and leaves other portions of the game, such as music, sprites and menus, untouched.
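To make the IWAD/PWAD relationship more concrete, here is a minimal sketch of name-based overriding. It is not how the engine or my loader is actually written: each WAD is modelled as a plain dictionary of lump name to bytes, the `find_lump` helper and the `MYLEVEL.WAD` file name are hypothetical, and the only behaviour it assumes is that a WAD loaded later takes priority over one loaded earlier.

```python
def find_lump(wads, name):
    """Return (wad_label, lump_bytes) from the last-loaded WAD containing `name`.

    `wads` is a list of (label, {lump_name: bytes}) pairs in load order,
    IWAD first and PWADs after it. This is an illustrative model only.
    """
    for label, lumps in reversed(wads):
        if name in lumps:
            return label, lumps[name]
    return None


# Hypothetical contents: the IWAD has everything, the PWAD only replaces MAP01.
iwad = ("DOOM2.WAD", {"PLAYPAL": b"iwad palette", "MAP01": b"iwad map marker"})
pwad = ("MYLEVEL.WAD", {"MAP01": b"pwad map marker"})
load_order = [iwad, pwad]

print(find_lump(load_order, "MAP01"))    # resolved from the PWAD
print(find_lump(load_order, "PLAYPAL"))  # falls back to the IWAD
```

The sketch only covers lumps with unique names. Map sub-lumps such as THINGS share the same name across every level, so they are located relative to their map marker instead, which is what the loader later on this page does.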
Each level consists of a map marker entry in the directory (such as E1M1) followed by 10 lumps of its own:

```txt
LumpName  Description
--------  -----------
THINGS    monsters, weapons, keys, etc
LINEDEFS  line data from one vertex to another
SIDEDEFS  defines what to draw on the side of a line
VERTEXES  start and end points for lines and segs
SEGS      subdivided lines created by the Binary Space Partitioner
SSECTORS  subdivided convex SECTOR polygons created by the BSP
NODES     branches of the BSP tree
SECTORS   partitioned areas of the map, such as rooms, stairs etc
REJECT    ai rules for monsters in sectors
BLOCKMAP  map subdivision lists of lines for collision detection
```

A level PWAD can contain any amount of data to overwrite in the original IWAD; it does not have to be complete. We could write a PWAD for E1M1 on DOOM with just new THINGS data.

## Data Layout

```txt
# BIG-ENDIAN format
                       <---- 32 bits  ---->
                       /------------------\
            --->  0x00 |  ASCII WAD Type  | 0X03
            |          |------------------|
    Header -|     0x04 | # of directories | 0x07
            |          |------------------|
            --->  0x08 | directory offset | 0x0B --
            --->       |------------------| <--    |
            |     0x0C |     Lump Data    |    |   |
            |          |------------------|    |   |
    Lumps - |          |        .         |    |   |
            |          |        .         |    |   |
            |          |        .         |    |   |
            --->       |        .         |    |   |
            --->       |------------------| <--|---
            |          |    Lump offset   |    |
            |          |------------------|    |
 Directory -|          | directory offset | ---
    List    |          |------------------|
            |          |    Lump Name     |
            |          |------------------|
            |          |        .         |
            |          |        .         |
            |          |        .         |
            --->       \------------------/
```
_excerpt from DIY_

The above diagram shows the general layout of a WAD file with its byte offsets and contents. One caveat about the excerpt: although it is labeled big-endian, WAD files store their multi-byte integers in little-endian byte order (least significant byte first), which matters when unpacking them.

The Header section contains:

* WAD Type indicates if it is an IWAD or PWAD
* Number of directories is an integer representing the number of directory entries contained in the WAD
* Directory offset is the byte position at which the directory list starts

After the header come the lumps themselves, located by the offsets recorded in the directory.

The Directory allows us to navigate within the WAD file to any lump we care to load, such as THINGS, VERTEXES etc.
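As a rough illustration of that layout, the sketch below builds a tiny one-lump WAD in memory and reads it back with Python's `struct` module. It is independent of the loader developed on this page: `build_tiny_wad`, `parse_header`, `parse_directory` and the `DEMO` lump are made-up names for the example. The format strings assume little-endian fields, a 12-byte header, and 16-byte directory entries made of a lump offset, a lump size and an 8-byte name.

```python
import struct

HEADER = struct.Struct("<4sII")     # WAD type, entry count, directory offset
DIR_ENTRY = struct.Struct("<II8s")  # lump offset, lump size, lump name


def build_tiny_wad():
    """Build a hypothetical one-lump PWAD in memory for demonstration."""
    lump = b"hello"
    dir_offset = HEADER.size + len(lump)  # directory follows the lump data
    header = HEADER.pack(b"PWAD", 1, dir_offset)
    # '8s' pads the short name with NUL bytes up to 8 characters
    entry = DIR_ENTRY.pack(HEADER.size, len(lump), b"DEMO")
    return header + lump + entry


def parse_header(data):
    """Return (type, entry count, directory offset) from the first 12 bytes."""
    wad_type, count, dir_offset = HEADER.unpack_from(data, 0)
    return wad_type.decode("ascii"), count, dir_offset


def parse_directory(data, count, dir_offset):
    """Return a list of (name, offset, size) tuples, one per directory entry."""
    entries = []
    for i in range(count):
        offset, size, raw_name = DIR_ENTRY.unpack_from(
            data, dir_offset + i * DIR_ENTRY.size)
        entries.append((raw_name.rstrip(b"\0").decode("ascii"), offset, size))
    return entries


wad = build_tiny_wad()
wad_type, count, dir_offset = parse_header(wad)
entries = parse_directory(wad, count, dir_offset)
name, offset, size = entries[0]
print(wad_type, count, dir_offset, name, wad[offset:offset + size])
```

Reading from a `bytes` object keeps the example self-contained; the same format strings would apply to bytes read from a real file opened in binary mode.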
Each directory entry defines:

* the Lump Name
* the Lump Byte Position within the WAD
* the Lump Byte Size to load

Considering that a Map is just a subsection of a WAD file, it makes sense to represent a DOOM Map and its structures with a class. This class will define all of the corresponding lumps, along with metrics around their binary sizes, positions and data types. It can also contain the logic for linking data together, such as vertices to their parent linedefs, and functionality for traversing the BSP Tree.

### The BSP Tree

The BSP Tree is generated by the Binary Space Partition algorithm before the WAD is created, and it is stored in the WAD file as a set of tree Nodes. This data structure is very important for rendering DOOM walls in the game and for calculating wall positions in relation to the player position. I will not be doing a deep dive on the BSP tree here, but one can be found in the algorithm section on [Binary Space Partitioning](/book/algorithms/binary_space_partitioning.md).

## The MAP Class

The `Map` class contains all of the data of a Map lump that is loaded. It defines the static layout of the lumps within it and has lists for all of the data that will be loaded. Additional functions for linking data and navigating the BSP tree will be appended later.
```python
class Map(object):

    # used to identify if a node id has the sector bit on the end
    # 0x8000 in binary 1000000000000000
    # 0x8000 in decimal 32768
    SUBSECTORIDENTIFIER = 0x8000

    class Indices:
        NAME = 0
        THINGS = 1
        LINEDEFS = 2
        SIDEDEFS = 3
        VERTEXES = 4
        SEGS = 5
        SSECTORS = 6
        NODES = 7
        SECTORS = 8
        REJECT = 9
        BLOCKMAP = 10
        COUNT = 11 # enum helper for looping

    def __init__(self):
        # WAD Data
        self.name = ""
        self.vertices = []
        self.linedefs = []
        self.things = []
        self.nodes = []
        self.subsectors = []
        self.segs = []
        self.sectors = []
        self.sidedefs = []
```

As you can see, the `Indices` are listed in the same order as the level lumps listed above, with the map marker itself at index 0.

## The WAD Class

If a `Map` is a class that represents a WAD's map lump, we should also define a `WAD` class that is capable of loading a WAD file and populating our `Map` object data. Here is the start of that `WAD` class:

```python
class WAD(object):

    def __init__(self, wadpath):
        self.wadpath = wadpath
        self.f = open(self.wadpath, 'rb') # read-binary
        self.loadHeader()
        self.loadDirs()

    def loadHeader(self):
        # The header has a total of 12 bytes (0x00 to 0x0b)
        # this 12-bytes is divided to 3 groups
        # first 4 bytes is the WAD type as CHAR
        # second 4 is count of directories as Int
        # third 4 is Int offset of directories
        self.type = self.load_string(0, 4) # char[]
        self.dircount = self.load_uint32(4) # uint32
        self.diroffset = self.load_uint32(8) # uint32

    def loadDirs(self):
        self.dirs = []
        self.dirMap = {}
        for i in range(0, self.dircount):
            offset = self.diroffset + 16 * i
            # get dir info
            directory = Directory()
            directory.lumpOffset = self.load_uint32(offset)
            directory.lumpSize = self.load_uint32(offset + 4)
            directory.lumpName = self.load_string(offset + 8, 8)
            self.dirs.append(directory)
            # keep hashmap of directory name to its index
            self.dirMap[directory.lumpName] = len(self.dirs) - 1
```

The constructor opens the WAD file from disk and then loads its two main structures: the Header and the Directory listing.
Loading the header is relatively simple if we follow the defined field sizes, and it gives us the count and offset of the directory entries.

The Directory loading operation builds a list of `Directory` objects containing the lump position, size and name for all of the data in our WAD file. We then keep a handy hashmap from lump name to its position in our directory list for lookup. Because names such as THINGS repeat for every map, this lookup is only reliable for unique names such as map markers.

The `Directory` class:

```python
class Directory(object):
    def __init__(self):
        self.lumpOffset = 0 # uint32
        self.lumpSize = 0 # uint32
        self.lumpName = '' # char[8]
```

Continuing on with our `WAD` class, we need to be able to load a map and all of its subsequent parts into our `Map` class. We can do so with these new `WAD` class methods:

```python
    # ... continuation of WAD class

    def loadMap(self, mapName):
        map = Map()
        map.name = mapName
        if self.loadMapData(map):
            return map
        return None

    def loadMapData(self, map):
        mapIndex = self.findMapIndex(map)
        if mapIndex == -1:
            return False
        # load map data
        if self.readMapDataList(map, mapIndex + Map.Indices.THINGS, "THINGS", Thing.sizeof(), self.readThingData, map.things) is False:
            print("ERROR: Failed to load map things " + map.name)
            return False
        # ... skipped other map data ...
        # run some helpers to define the map
        map.createData()
        return True

    def findMapIndex(self, map):
        if map.name in self.dirMap:
            return self.dirMap[map.name] # get index
        return -1
```

The method `loadMap` creates our `Map` instance and calls `loadMapData`. `loadMapData` then uses the Directory map to find the directory index based on the name of our Map lump, such as "E1M1" or "MAP03". Once we have found that index we can load map data into the map object from its sub-lumps for THINGS, LINEDEFS etc.

This data loading is facilitated by `readMapDataList`, which is defined as:

```python
    # ... continuation of WAD class

    # LIST LOADER
    # Takes a pointer to a list location in
    # the WAD and loads the data into the
    # relevant map list
    def readMapDataList(self, map, indexOffset, lumpName, byteSize, reader, mapList):
        directory = self.dirs[indexOffset]
        if directory.lumpName != lumpName:
            return False
        # number of fixed-size records in the lump
        count = directory.lumpSize // byteSize
        for i in range(0, count):
            item = reader(directory.lumpOffset + i * byteSize)
            mapList.append(item)
        return True
```

This method takes the index of the lump in the directory list (the map's index plus the lump's position within the map) and uses that directory entry to seek to the position in the WAD to load data from. It also takes a `byteSize`, which is the size of each record in the lump, and a callable named `reader` that knows how to load the binary data into a map object such as a thing. Finally, it takes the list to append this data to. We will look closer at this next when loading our `THINGS` lump.

## Loading THINGS

A `THING` is comprised of the following attributes, as read by this loader:

* x position in map as an `int16`
* y position in map as an `int16`
* angle in map as a `uint16`
* type as a `uint16`
* flags as a `uint16`

These 5 attributes are each 16-bit integers, so each `THING` is 10 bytes. Recall that our Directory records include both the lump position and byte size, so we can calculate the number of THINGS within the lump with:

`THINGS lump byte size / THING byte size`

```python
class Thing(object):

    # Types https://doomwiki.org/wiki/Thing_types
    class Types:
        # Other
        O_PLAYER1 = 1
        # ...
        M_CACODEMON = 3005
        M_LOSTSOUL = 3006
        # ...

    def __init__(self):
        # WAD DATA
        self.x = 0 # int16
        self.y = 0 # int16
        self.angle = 0 # uint16
        self.type = 0 # uint16
        self.flags = 0 # uint16

    @staticmethod
    def sizeof():
        return 10
```

The important parts of this class are the properties and the byte `sizeof`. THINGS are special in that they also have flag and type data; I have omitted a very long list of type integers here.
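The record arithmetic above can be sketched on its own, outside the `WAD` class. This is only an illustration under a few assumptions: the lump is little-endian, the field types match the `Thing` class above, and `parse_things` plus the two sample records are hypothetical values made up for the example.

```python
import struct

# x, y as signed 16-bit; angle, type, flags as unsigned 16-bit
THING = struct.Struct("<hhHHH")


def parse_things(lump):
    """Split a raw THINGS lump into (x, y, angle, type, flags) tuples."""
    count = len(lump) // THING.size  # lump byte size / thing byte size
    return [THING.unpack_from(lump, i * THING.size) for i in range(count)]


# Hypothetical lump holding two records: a player 1 start (type 1)
# and a cacodemon (type 3005), both with made-up positions and flags.
lump = THING.pack(-32, 64, 90, 1, 7) + THING.pack(128, -256, 270, 3005, 7)

things = parse_things(lump)
for x, y, angle, thing_type, flags in things:
    print(x, y, angle, thing_type, flags)
```

Unpacking whole records with a single format string is a different design from the per-field `load_*` calls used on this page, but both rely on the same 10-byte layout.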
Our `WAD` class references `Thing.sizeof()` when it loads the map data for THINGS with the following code:

```python
        # ... from WAD class loadMapData function
        if self.readMapDataList(map, mapIndex + Map.Indices.THINGS, "THINGS", Thing.sizeof(), self.readThingData, map.things) is False:
            print("ERROR: Failed to load map things " + map.name)
            return False
```

Here we specify both the map index and the THINGS offset within it, then the size in bytes of a single THING, a reader function that knows how to load this data from the WAD, and finally the map's things list to append into.

Let's take a look at our reader function for THING data:

```python
    # ... continuation of WAD class

    def readThingData(self, offset):
        t = Thing()
        t.x = self.load_sshort(offset)
        t.y = self.load_sshort(offset + 2)
        t.angle = self.load_ushort(offset + 4)
        t.type = self.load_ushort(offset + 6)
        t.flags = self.load_ushort(offset + 8)
        return t
```

In a relatively simple fashion we load the THING data into a `Thing` object based on the data layout of its properties.

The `load_*` methods were actually a little tricky to define in Python, and it's been some time since I wrote them so I am a bit rusty on how they work, but they are as follows for reference:

```python
    # ... continuation of WAD class

    # DATA TYPE LOADERS
    # Gets raw byte data from WAD
    # in expected format
    def load_string(self, offset, length, preserveNull = False):
        self.f.seek(offset)
        sss = ''
        for i in range(0, length):
            c = struct.unpack('