
Make Modular Grids in Seconds

The World of Invisible Structures

My first work experience as a designer started with a magazine layout. In other words, it started with a grid: those thin red lines in mid-2000s InDesign. I remember the curiosity with which I was learning how grids work, what kinds there are and how to use them properly in any given situation. In the country I was living in at the time it was almost impossible to just go out and buy a decent-looking magazine or a specific book on the subject. As for grids, there were a couple of books circulating around: good reads like Jan Tschichold’s “The Form of The Book”, Josef Müller-Brockmann’s “Grid Systems in Graphic Design” and something else that I can’t quite remember. And of course, like lots of fellow designers, I constantly explored the world of dial-up internet in search of articles, rules, books, examples or even loosely related materials. It’s been around 10 years since I made and forgot that magazine layout, but that curious feeling hasn’t left me to this day. I think it has even grown.

How it All Started

Around a year ago, during a short visit to Moscow, I was asked to make a series of prototypes for one of the main Russian opposition media outlets, Novaya Gazeta. I was excited to take the job for two reasons: I’ve always been fond of media-related projects, and the project was to be carried out together with Charmer, one of Russia’s most prominent design studios, which is responsible for almost all of the Russian media redesigns worth looking at in recent years.

Before proceeding to execution we had a couple of talks with Charmer’s art director and co-founder Alexander Gladkikh about different approaches to media redesign, the use of grid systems, the future of technology and the role of media in it. I was inspired by his body of work and his systemic approach, which obviously made his designs stand out from the rest.

He showed me some of his grids, which were proportion-based according to the type of media used in the project (e.g. 16x9 for video-rich websites, 3x2 for photos and so on). That was a new approach for me, since I usually constructed my grid systems upon baseline type values: defining the vertical rhythm first and then composing the layout with it in mind. Our talks prompted a lot of thinking and exploration, which led me to search for an algorithm that would combine the media-centric and type-centric approaches into one system through a series of calculations. The idea was to wrap these calculations into an easy-to-use wizard that would allow any designer to try it out and, with luck, find their own way into calculations in layout design.

Why it’s Important at All

In any design discipline I’m aware of, and especially in graphic design, grids are one of those things that make a visual and logical impact without being directly seen. At a very abstract level, they first build a hierarchy and then distribute objects within its logical boundaries. It’s like the DNA in a cell that tells its proteins to organise in a certain way to build a working system. From the layout perspective, any media website is a combination of images and text, spiced up with different kinds of separators, framed in different kinds of boxes, sometimes partly replaced by icons and usually differentiated by colour. An image can contain some action within its dimensions, in which case we might call it a video, but that distinction doesn’t matter at this point and will only come in handy a couple of steps later.

So let’s start with text. Obviously, to visualise it we need fonts, whose legibility lets our readers take in, without disruption, the information that the website’s authors or editors are about to share. Among all the text styles that can appear on your website (titles, headlines, leads, etc.), body copy (the one you are reading right now) is the one you read the most. So, when it comes to text legibility, it’s the right place to start. As we know from earlier studies, a legible body of text, be it in a book or on a website, should have a well-adjusted vertical rhythm. When you set the properties of your body copy to make it legible in terms of vertical rhythm, most graphic editors (like Sketch, Photoshop or Illustrator) will, besides the choice of font, which we will cover later, make you deal with two main numbers: font size and line height. Together they give you a body copy font-size/line-height ratio (let’s call it flr from here on). In the illustration below you can see the body copy flr’s of some popular media websites.

Since there is still a lot to tell, I will not focus on the process of setting the flr and will move on to font selection. For those of you who are encountering this process for the first time I quickly dug up some links like this, this and this, but rest assured, Google can guide you even further.

When you start applying different fonts to your body copy (using the same flr’s) you will find that, among other visual changes, the vertical rhythm of your body copy changes, sometimes a lot. You can see it in the example below, where I used Helvetica Neue and Baskerville with the same font size and line height values.

This image demonstrates the different levels of information that you consume with text. From left to right the amount of information decreases, from a regular text block to a pure demonstration of rhythm.

That means you can’t just pick any flr from a website you like, apply it to the typeface of your choice and expect it to work. It will not, mostly because different fonts, among other distinctions, have different x-height values, as in the example with Helvetica Neue (x-height is around 50% of the font height) and Baskerville (x-height is much lower than 50% of the font height). But the x-height value (like many other font values) can’t be seen in Sketch or Photoshop, so what you can do is use your eyes, experience and sense of beauty to choose a font that forms the rhythm that suits your body copy’s needs. And that’s where the first challenge begins.

I was searching for a good read on x-height/font-height relations and setups and found one interesting point in Xavier Bertels’ presentation, where he states that there is a strong dependency between body copy legibility and the relation between x-height and em-box height (the size of lowercase letters compared to uppercase letters): the closer the x-height gets to 50% of the em-box height, the better for your body copy’s legibility. He even made a formula for that, which is illustrated below, with 16 as the font size and 8 as the x-height value within it:
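As a minimal sketch of that relation, assuming it boils down to the x-height divided by the font size (the exact form of Xavier’s formula is not reproduced here, and the function names are made up for illustration):

```python
def x_height_ratio(x_height: float, font_size: float) -> float:
    """Return the x-height as a fraction of the font size (em box)."""
    if font_size <= 0:
        raise ValueError("font_size must be positive")
    return x_height / font_size


def deviation_from_half(x_height: float, font_size: float) -> float:
    """Signed distance of the ratio from the 50% mark mentioned above."""
    return x_height_ratio(x_height, font_size) - 0.5


# The example from the text: font size 16, x-height 8.
print(x_height_ratio(8, 16))       # 0.5
# A hypothetical font with a lower x-height at the same size.
print(deviation_from_half(7, 16))  # -0.0625
```

A negative deviation means the lowercase letters are smaller than half the em box, a positive one that they are larger.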

Of course, there are a lot of other factors in a font’s legibility and its suitability for body copy, but the x-height/font-size relation is definitely a good one to keep in mind when choosing fonts for your next project. And it would be quite handy to be able to browse your font library by additional metrics like this one.

New Adobe Typekit Font Browser with x-height property within its filters

So, once you have chosen a typeface for your body copy and defined its appropriate size and line height, you’ve got a number which you can now use to build your baseline grid. In my case this number (the baseline micro-module) is my body copy’s line height divided by 2, or by 3 in some cases. Xavier suggests dividing it by 3 or 4 in his presentation, but it’s really up to you as long as the line height divides evenly. So, if my body copy’s line height is 22 pt, my baseline micro-module would be 11 pt. Easy as that. Now I can build something like this:
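In code, the same arithmetic might look like the sketch below. The function names and the 66 pt page height are illustrative assumptions; only the 22 pt line height and the divisor of 2 come from the text.

```python
from typing import List


def baseline_module(line_height: int, divisor: int = 2) -> int:
    """Derive the baseline micro-module from the body copy line height."""
    if divisor <= 0 or line_height % divisor != 0:
        raise ValueError("line height must divide evenly by the divisor")
    return line_height // divisor


def baseline_positions(module: int, page_height: int) -> List[int]:
    """Return the y positions of the baseline grid lines on a page."""
    return list(range(module, page_height + 1, module))


module = baseline_module(22, 2)
print(module)                          # 11
print(baseline_positions(module, 66))  # [11, 22, 33, 44, 55, 66]
```

Every second line of this grid is a line of body copy, and the lines in between give headings and images a finer step to snap to.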

And then I can set my type on it:

The next step is where it gets a little tricky. We need to draw vertical lines to be our columns; let’s say we need 12 of them this time. Now we need to specify our column width and the width of our gutters (if, of course, we need them). These widths can be defined by specific media proportions that need to fit within some number of our columns, by typical ad banner widths or by the dimensions of any graphical object that we might want to use at an exact size. As for the most common media proportions, I would name 3:2 (commonly used for photos) and 16:9 (commonly used for videos). There are also a lot of other proportions which you can discover around the web or in print. What’s interesting about proportions is that they are universal and make sense not only in graphical representation but also, for example, as sounds.

“3:2, beside being common proportion for shooting photos is known as Perfect Fifth in music, and serves to tune a violin starting from the 14th century”

So, now we need to align our proportion (let’s choose 16:9) with our baseline rhythm to form a system. When I faced this problem for the first time I chose to calculate it this way: take all 16:9 blocks in the range from 160/90 px to 1280/720 px and find out which of them are divisible by 11, meaning that they match my baseline rhythm. If you want to get an idea of how that looked, here’s a shortened version of the process:
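The same search can be sketched in a few lines. This is an illustration rather than the original process: it assumes that "divisible by 11" refers to the block height (the vertical rhythm) and that the candidate blocks are the whole-pixel multiples of 16x9. The function name is made up.

```python
from typing import List, Tuple


def rhythmic_blocks(ratio_w: int, ratio_h: int, baseline: int,
                    min_width: int, max_width: int) -> List[Tuple[int, int]]:
    """List (width, height) blocks of the given ratio whose height
    is a whole multiple of the baseline number."""
    blocks = []
    k = 1
    while ratio_w * k <= max_width:
        width, height = ratio_w * k, ratio_h * k
        if width >= min_width and height % baseline == 0:
            blocks.append((width, height))
        k += 1
    return blocks


for block in rhythmic_blocks(16, 9, baseline=11, min_width=160, max_width=1280):
    print(block)
# (176, 99)
# (352, 198)
# (528, 297)
# (704, 396)
# (880, 495)
# (1056, 594)
# (1232, 693)
```

Every surviving block is a whole multiple of the smallest one, which is why the smallest width can later serve as the column width.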

I took all my blocks and, because they share the same proportion (16:9), easily arranged them with each other. Then I drew columns based on the smallest block’s width. And, of course, all blocks matched my 8px baseline rhythm. Later I found out that the same logic can be observed in the way overtones are arranged in music.

It appeared to be 8 columns in the first test, but can be as many as you like.

From this little experiment we learned that, in the entropy of all possible grid values, there are some that not only respect a particular ratio and form a hierarchical structure of sizes among themselves, but also rely on the predefined vertical rhythm that is stored within your body copy. These hierarchical systems are also easily divisible into columns when the smallest block’s width is used as the column width. So, instead of trying to put the elephant into a pre-defined box (in our case, typical grid widths like 960 or 1024 px that many use by default), we can craft the box according to the elephant’s metrics and let him in with room to breathe. What we need is to apply some math to extract values from our typography settings and use them as a filter to pick the right sizes from the sequence of the selected ratio, and then build the whole system out of them. At that point I thought it would be good to consult somebody with more of a background in math and coding. For the next part of the article I’m handing over to my partner in crime, Nazar Grabovsky, who joined me in this project starting from, well, now.

Getting Involved

by Nazar Grabovsky

I have never worked in web design or typography myself, but I have always had an interest in typeface anatomy, how good legibility is achieved and related topics. When Ross approached me with his idea I found it quite interesting and offered to help solve the numerical problems he had encountered. I have been developing algorithms of various types for some time, including for graphics and image processing. And, at first glance, it seemed that rhythm in grids should have a concise and elegant arithmetic solution. But first I had to dive into the world of typography. While discussing with Ross and researching the idea of rhythm, how it’s obtained and applied, I learned quite a bit about modern typography (for a beginner in this area). You can get an idea of the epistemological effort I had to make by taking a look at my notes.

At first I couldn’t grasp the idea of rhythm in terms of rectangular building blocks, or why anyone would want to constrain their layout design to any numbers. But gradually, as I got a better understanding of typography fundamentals, it became clear. My main goal became to decompose the emerging numerical problem into basic orthogonal elements, so that it would be easy to build up a rhythmic grid from them. Like any numerical problem and its solution (an algorithm), it must have some input (given initially) and output (a desired result). For me it started with analyzing inputs and outputs.

Analytical solution

The main number that defines the whole rhythm is the baseline number, which is basically a fraction of the line height, usually the line height divided by 2 or 3. (Do not confuse it with the baseline metric, the horizontal line upon which letters are positioned; the baseline number is just a number derived from font metric values.) Based on the baseline number we need to obtain the size of a micro-block: the largest possible block fitting the rhythm within a single column. Such a micro-block allows us to build up bigger blocks and the rest of the grid. The non-trivial part comes into play when we consider gutters, since a gutter must be a multiple of the baseline number by definition. Gutters consequently become part of bigger blocks, which turns the computation of the optimal rhythm into a puzzle. Without gutters the solution is rather straightforward: just integer division, discarding any remainders. With gutters, the layout width must be taken into consideration (as an upper limit) simultaneously with the baseline number (as the gutter multiple). Consequently, baseline number vs. gutter size is the main interrelation between the vertical and horizontal dimensions of the micro-block.

Since the whole problem lies in the integer domain (we cannot split a pixel into smaller pieces), it restricts the arithmetic we can apply to integer arithmetic only (division without remainders). But after some juggling with modular arithmetic, a solution was found based on the Least Common Multiple. To keep it complete and slightly formal I have summed up the arithmetic definition of the micro-block in the following concise formula:

But do not get discouraged if you are not familiar with such notation. All it means is that the micro-block height must be a multiple of the baseline number and, at the same time, the grid width must be divisible by the micro-block width, taking gutter size into consideration. If you are curious about the details, you can check the more explicit derivation, and feel free to drop a comment below if clarification is needed.
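To make those two conditions concrete, here is a minimal sketch of one way to satisfy them. It is a reading of the description above, not the formula itself, and it assumes an exact integer ratio, a fixed number of columns, and a grid width equal to columns times micro-block width plus the gutters in between. The name `micro_block` and its parameters are made up for illustration.

```python
from math import gcd
from typing import Optional, Tuple


def micro_block(max_width: int, columns: int, gutter: int,
                baseline: int, ratio: Tuple[int, int]) -> Optional[Tuple[int, int]]:
    """Largest (width, height) block of the given ratio that fits one
    column and whose height is a multiple of the baseline number."""
    if gutter % baseline != 0:
        raise ValueError("gutter must be a multiple of the baseline number")
    a, b = ratio
    d = gcd(a, b)
    a, b = a // d, b // d                # reduce the ratio, e.g. 32:18 -> 16:9
    step = baseline // gcd(b, baseline)  # smallest scale with a rhythmic height
    width_step = a * step
    height_step = b * step               # equals lcm(b, baseline)
    # Width left for one column once the gutters between columns are removed.
    available = (max_width - (columns - 1) * gutter) // columns
    multiples = available // width_step
    if multiples < 1:
        return None                      # no rhythmic block fits this layout
    return multiples * width_step, multiples * height_step


print(micro_block(960, 12, 16, 8, (1, 1)))   # (64, 64)
print(micro_block(1280, 8, 0, 8, (16, 9)))   # (128, 72)
```

In the first call the resulting grid is 12 x 64 + 11 x 16 = 944 px wide, which stays under the 960 px upper limit. The height step is where the Least Common Multiple shows up: it is the smallest height that is a multiple of both the ratio’s vertical term and the baseline number.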

Rhythmic grid generator

The first version of the grid generator tool, based on the solution described above, was implemented in Octave, a handy open-source environment for math manipulations. This tool only generates a grid configuration and a plain image of the corresponding layout. But with it we were able to generate hundreds of rhythmic grid layouts in minutes and inspect them in order to validate the derived formula and algorithm.

A small sample of generated grids (27 out of ~1000)

One interesting fact came out while reviewing dozens of generated layouts: some block ratios are more flexible than others. Since only a rather limited number of proportions is used in media, we consider only the 1:1, 3:2 and 16:9 ratios. For 1:1 you can get rhythmic blocks with basically any gutter size. For 16:9 it’s completely the opposite: you can only use a zero gutter to get the desired block proportions. Lastly, 3:2 is in the middle; it has a fair number of rhythms, but not as many as 1:1. Mainly it’s because of the number 9, since odd numbers are not as friendly to divisibility as the numbers 1 and 2. And there is not much you can do about it if you want to stay strictly precise with ratios.

1:1 aspect ratio is flexible with different gutters. 16:9 does not have any rhythmic gutters.
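One way to see why the ratios behave so differently is to compute the smallest non-zero gutter each of them tolerates. The sketch below rests on an assumed model, not on the generator itself: a block spanning k columns is k micro-blocks plus k-1 gutters wide, and it must keep the exact ratio with a height that is a multiple of the baseline number. The function names are hypothetical.

```python
from math import gcd
from typing import Tuple


def lcm(x: int, y: int) -> int:
    """Least common multiple of two positive integers."""
    return x * y // gcd(x, y)


def smallest_rhythmic_gutter(baseline: int, ratio: Tuple[int, int]) -> int:
    """Smallest non-zero gutter that keeps multi-column blocks rhythmic
    under the model described above."""
    a, b = ratio
    d = gcd(a, b)
    a, b = a // d, b // d
    # Block widths must be multiples of this step for the height to stay
    # an exact multiple of the baseline at ratio a:b.
    width_step = a * baseline // gcd(b, baseline)
    # The gutter must be a multiple of the baseline and of the width step.
    return lcm(baseline, width_step)


for ratio in [(1, 1), (3, 2), (16, 9)]:
    print(ratio, smallest_rhythmic_gutter(8, ratio))
# (1, 1) 8
# (3, 2) 24
# (16, 9) 128
```

With an 8 px baseline, 1:1 accepts every multiple of 8 as a gutter, 3:2 only every multiple of 24, and 16:9 nothing smaller than 128 px, which in practice leaves the zero gutter as the only usable option.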

The next step was to make a web app. And that’s where more fun starts: precise typography in the browser. The two main challenges we faced were pixel-precise positioning of the text baseline and extracting a font’s vertical metrics. The problem of font metrics on the web is pretty much solved. For example, OpenType.js lets you extract all the metrics contained within a font file. The actual problem arises when we want to use fonts already present on the user’s system. Since a browser (a web app) is forbidden from accessing files without explicit user interaction (pop-ups, etc.), tools like OpenType.js cannot be used, because they need to know where the font files are located. Moreover, we can’t even reliably check which fonts are actually present on the user’s system.

We had to figure out a way to detect system fonts and extract their metrics. Luckily we were not the first to bump into this puzzle, and it was enough for us to combine existing solutions. The font detection trick is based on the neat idea that a run of repeated letters occupies a width that is characteristic of each typeface when rendered in the browser. Consequently, if the selected font is not present on the user’s system, the sample text is rendered with a fallback font whose horizontal span we can measure beforehand and compare against. Well, it’s actually kind of a brute-force solution, and there is a low probability of detection collisions (when the same text sample set in two different typefaces occupies the same horizontal space), but it actually works quite well for the majority of standard font families.

Credit for font detection trick goes to Lalit Patel, see Lalit.lab for more details.
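The comparison logic behind the trick can be sketched independently of the browser. In the sketch below the `measure` callback stands in for whatever renders the sample and reports its size (in a web app, a hidden element or a canvas); the callback, the sample string and the `fake_measure` stub with its numbers are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

Size = Tuple[float, float]  # (width, height) of the rendered sample

SAMPLE = "mmmmmmmmmmlli"  # wide and narrow letters exaggerate differences
FALLBACKS = ("monospace", "sans-serif", "serif")


def is_font_available(font: str, measure: Callable[[str, str], Size],
                      fallbacks: Sequence[str] = FALLBACKS,
                      sample: str = SAMPLE) -> bool:
    """Report a font as present if the sample renders differently from
    at least one generic fallback when the font is put first in the stack."""
    for fallback in fallbacks:
        baseline_size = measure(fallback, sample)
        candidate_size = measure(f"'{font}', {fallback}", sample)
        if candidate_size != baseline_size:
            return True
    return False


def fake_measure(font_stack: str, text: str) -> Size:
    """Stand-in renderer: pretends only Helvetica Neue is installed."""
    per_char = {"Helvetica Neue": 9.1, "monospace": 10.0,
                "sans-serif": 9.0, "serif": 8.5}
    for name in (part.strip().strip("'") for part in font_stack.split(",")):
        if name in per_char:  # first family in the stack that "exists"
            return (per_char[name] * len(text), 72.0)
    raise ValueError("no usable font in stack")


print(is_font_available("Helvetica Neue", fake_measure))  # True
print(is_font_available("Nonexistent", fake_measure))     # False
```

Testing against several fallbacks reduces the chance of a collision: a font would have to match the width of every generic family at once to go undetected.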

At this point popular system fonts are detected (just their names), but we still need to somehow extract font metrics from them and visualize them. None of the browsers provides an out-of-the-box way to get information about font glyphs and their metrics, so as a workaround the vertical metrics are extracted with the help of the graphics tools available in the browser. Specifically, the chosen font is rendered on the browser’s 2D canvas and then analyzed with a bit of code using a scanline approach, a conventional family of algorithms in computer graphics. By detecting the lower-bound and upper-bound lines for the H, d, x and p glyphs individually we obtain the main vertical metrics of the selected font: cap height, ascent, x-height and descent respectively. Now that we have system fonts and their metrics extracted, we can derive the x-height in UPM (units per em) in order to assess font legibility and visualize how much it deviates from the recommended 500 UPM.

Credit for dynamic font metrics extraction goes to Mike Kamermans from Mozilla Foundation team.
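The scanline idea itself is simple enough to sketch on a plain bitmap. The example below is an illustration under stated assumptions rather than the code used in the app: the glyph is already rasterized into rows of 0/1 pixels (in the browser these would come from the canvas pixel data), the em is taken to be 1000 units, and the tiny hand-drawn "x" is made up.

```python
from typing import List, Optional, Tuple

Bitmap = List[List[int]]  # rows of pixels, 1 = ink, 0 = background


def ink_rows(bitmap: Bitmap) -> Optional[Tuple[int, int]]:
    """Scan row by row and return the first and last rows containing ink."""
    rows = [y for y, row in enumerate(bitmap) if any(row)]
    if not rows:
        return None
    return rows[0], rows[-1]


def x_height_in_upm(x_bitmap: Bitmap, font_size_px: int, upm: int = 1000) -> float:
    """Convert the measured pixel height of an 'x' glyph to units per em."""
    bounds = ink_rows(x_bitmap)
    if bounds is None:
        raise ValueError("glyph bitmap is empty")
    top, bottom = bounds
    height_px = bottom - top + 1
    return height_px / font_size_px * upm


# A crude 'x' rendered at a font size of 8 px.
GLYPH_X = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
]

print(ink_rows(GLYPH_X))            # (3, 6)
print(x_height_in_upm(GLYPH_X, 8))  # 500.0
```

Running the same scan over H, d and p would give the cap height, ascent and descent bounds in the same way; rendering at a large font size keeps the rounding error of whole-pixel rows small.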

All this hassle with font detection and metrics extraction was undertaken because we wanted a web app to generate rhythmic grids. It seems we now have all the information necessary to put a grid together in a web layout. But hold on, not so fast. One more obstacle pops up. It turns out that CSS does not provide control over text baseline positioning (this time we are talking about the actual baseline, not the baseline number). And displaying the rhythmic grid with pixel precision is crucial, since we want to put the body text exactly on the baselines that are distributed evenly within the grid. I mean, in CSS you can position text vertically however you want (relatively, absolutely, shifted by different units) and specify the line height for multi-line text, but you cannot control where exactly the text baseline is placed for an arbitrary font. A simple web layout exercise exemplifies the issue: put some text on a page, draw a horizontal line and position this line as if it were the baseline. Then change the typeface or font size at random. Try it yourself. Oops, the text doesn’t stick to the baseline anymore after the font modifications? Sadly enough, the W3C states it directly: “CSS does not define the position of the line box’s baseline” (W3C CSS standard). It means you cannot specify the baseline position when placing a piece of text. Therefore, every browser vendor implements automatic baseline positioning according to its own beliefs, and of course the results differ among browsers (though not all of them).

Fortunately, folks from the Adobe community had encountered a similar issue, so I borrowed their strut solution for drop cap alignment (credit to Alan Stearns) and adapted it to aligning just the baseline, but for multi-line text. Try it. The strut trick still has a limitation on the minimum usable line height, generally around 120% depending on the typeface. But it’s the only viable cross-typeface and cross-browser solution I could figure out at the moment. I’d love to hear your ideas on how to achieve a perfect solution with no limitations. If you want to get a better understanding of what’s going on here, there is a great article by Christopher Aue explaining the internals of alignment in CSS and related problems: “Vertical Align: All You Need To Know”.


Pythagoras and his followers believed that “All Things Are Numbers”. They also believed that if you want to take a step towards understanding and deconstructing any type of harmony, numbers are the place to start. In our little experiment we found out that at least some aspects of composition can and should be viewed through the prism of numbers. Of course, there are still plenty of undiscovered gems in the field of design and layout composition, but we believe that precise proportions can give designers a ladder to a deeper understanding of the principles that lie within this strange thing that we call “beauty”. This was the first step, and there is still more to come. Now we invite you to visit and explore the world of computed grids for yourself and… stay tuned.