New site: Loading Jekyll on Sinatra and deploying on Heroku

December 10, 2016

It was time to refresh my personal blog. My WordPress site had been running since 2012, I was getting a bit fed up with how hard it was to change the CSS, and I wanted a domain where I could easily host other code projects.

So I pulled the content off the old WordPress template I’d had it on, and built a custom website in the Ruby framework Sinatra, using the Ruby-based blog creator Jekyll.

This is how I did it: the code, the pitfalls and some of the emotions I felt on my journey.

Getting stuff out of WordPress and cleaning the junk off it

The first step was to grab the old content. There weren’t loads of posts but I wanted to transfer them and all the metadata.

This blog about how to migrate from WordPress to Jekyll was a helpful reference, though I didn’t follow all of its recommendations.

WordPress allows you to export all your posts and metadata, but they come out in a zip file surrounded by a lot of junk html.

I used the gem DownmarkIt to load in the html posts and turn out text formatted in markdown, the format that Jekyll blogs use.

I had to tidy up the resulting markdown files, but this proved to be a godsend, because it was a fairly quick way to clean up a lot of files. And it autogenerated not just formatting but the front-matter (key: value headings for attributes like author and publication date) that Jekyll uses as metadata.
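To make the front-matter idea concrete, here is a minimal sketch of splitting a converted post into its metadata and its body. The sample post, its field values and the split_front_matter helper are all made up for illustration; Jekyll has its own parser and this isn't it.

```ruby
require 'yaml'

# Hypothetical sample post: the front matter sits between two '---' lines.
SAMPLE_POST = <<~POST
  ---
  title: My first post
  author: Jane
  ---
  Hello from *markdown*.
POST

# Illustrative helper (not part of Jekyll): returns [metadata hash, body].
def split_front_matter(text)
  match = text.match(/\A---\s*\n(.*?)\n---\s*\n(.*)\z/m)
  # No front matter found: empty metadata, whole text is the body.
  return [{}, text] unless match

  [YAML.safe_load(match[1]), match[2]]
end

metadata, body = split_front_matter(SAMPLE_POST)
puts metadata['title']
puts body
```

Everything between the two --- lines is plain YAML, which is why a converter that fills it in for you saves so much hand editing.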

This is the translator.rb script I used to run the posts through the DownmarkIt converter. Of course you’ll need to change the directory locations to match your project. It writes the markdown into each file after the original html, something I wanted so I could check it was working.


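    # Load the libraries the script relies on.
    %w(active_support rubygems sequel fileutils yaml active_support/inflector).each { |g| require g }

    require_relative 'downmark_it/downmark_it'

    # Change these locations to match your project. Directories are
    # skipped, because only files can be opened and rewritten.
    all_files = Dir['blog/_posts', '_posts/*.html'].select { |f| File.file?(f) }

    def transform(all_files)
      all_files.each do |file|
        # "r+" opens for reading and writing. After the read the position
        # is at the end of the file, so the markdown lands after the html.
        File.open(file, 'r+') do |blogpost|
          contents = blogpost.read
          blogpost.write DownmarkIt.to_markdown(contents)
        end
      end
    end

    transform(all_files)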

Getting Jekyll to work

Once I had the folder of markdown posts, I put them into a _posts directory as per the traditional Jekyll folder structure. The Jekyll website has a lot of helpful guides, the gem lets you get a boilerplate site off the ground quickly, and there are examples of how a vanilla Jekyll project works.

Getting Sinatra and Jekyll to work together

However, it was when I got to mounting the Jekyll blog on a basic Sinatra framework that I got frustrated, to the extent that I wished I hadn’t used Jekyll.

1) Figuring out the goddamn directory structure

After faffing around with the Jekyll and Sinatra parts in entirely separate directories, and then having them all in the same root directory, I eventually figured out the current structure, which can be seen in the site’s GitHub repo.

Basically I put the config.ru file for Sinatra and the _config.yml file for Jekyll both in the root directory. I allowed Jekyll to pick up markdown files from the default _posts directory and render them into blog/_site. It uses _layouts to display them correctly, as per a standard Jekyll install. Sinatra ignores the Jekyll-specific directories (_posts, _layouts, etc.) and Jekyll ignores the Sinatra-specific directories, thanks to the exclude setting I say more about below.

2) Build a Sinatra route to pick up the rendered html and display it as a webpage

This post by Derek Eder on setting up a Jekyll blog in Sinatra was particularly helpful. Basically, given that you have a directory of newly-minted html files, make sure they can be found when someone clicks on a link like http://mysite.com/blog/this-great-post.

The route below takes all requests to the site starting with /blog and grabs the path, e.g. /blog/this-great-post, from the request object.

  get '/blog/?*' do
    jekyll_blog(request.path)
  end

You’ll notice it takes the path and sends it off to a method called jekyll_blog. This is what that method does:

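  def jekyll_blog(path)
    # Map /blog/this-great-post onto the file Jekyll rendered into _site.
    # Only the leading /blog is stripped, so a slug containing "/blog" survives.
    file_path = File.join(File.dirname(__FILE__), '_site', path.sub(%r{\A/blog}, ''))
    # No file extension on the request? Fall back to the blog homepage.
    file_path = File.join(file_path, 'blog.html') unless file_path =~ /\.[a-z]+$/i

    # Return the rendered html, or nil if no such file exists.
    File.read(file_path) if File.exist?(file_path)
  end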

The first two lines look confusing, but all the first one does is take the /blog/this-great-post path and turn it into _site/this-great-post inside the app directory, the location on disk where the corresponding rendered post can actually be found.

File.join returns a new string formed by joining the strings using a slash.

The second line is a fallback: if the requested path doesn’t end in a file extension such as .html, it appends blog.html, so a request for plain /blog gets the blog homepage.

If a file exists at that location, Sinatra opens it, reads it and serves up its contents, i.e. a full html file.
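The path handling can be tried on its own, outside Sinatra. The sketch below pulls the first two lines of jekyll_blog into a standalone function. The name blog_file_path and the /app root are made up for the example, and it only strips a /blog at the very start of the path.

```ruby
# Hypothetical helper mirroring the first two lines of jekyll_blog.
# root stands in for File.dirname(__FILE__).
def blog_file_path(root, request_path)
  # Strip the leading /blog so the rest lines up with Jekyll's _site output.
  relative = request_path.sub(%r{\A/blog}, '')
  file_path = File.join(root, '_site', relative)
  # No file extension on the end? Point at blog.html inside that path.
  file_path = File.join(file_path, 'blog.html') unless file_path =~ /\.[a-z]+$/i
  file_path
end

puts blog_file_path('/app', '/blog/this-great-post.html')
puts blog_file_path('/app', '/blog')
```

Running a few of your own urls through something like this is a quick way to check that your post links and Jekyll’s output file names actually agree.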

3) Oh god, where did all the pictures go? - aka configuring the asset routes

I wanted to keep all the images in one directory, avoiding having to host the blog ones in one place, and the site ones in another. The solution I reached was to host all the images Sinatra style in a public/images directory.

To make these files accessible to the blog, I had to make sure the markdown posts were pointing to the correct images. In the markdown files from which the posts are generated, I wrote image links like this: (/images/cat-picture.jpg). In my _config.yml file for Jekyll, I set the image_base property for the site as image_base: '/images'. So when the file is rendered into html, it comes out like:

  <img src="/images/cat-picture.jpg" alt="wow cat" />

If you’re wondering what happened to the “public” in the public/images directory above, it’s a Sinatra convention that assets are stored in there, but that the public directory doesn’t appear in the path. So http://www.mysite.com/images/cat-picture.jpg will work, even though there’s this extra “public” level. There is more about this in the sparse Sinatra documentation and also on StackOverflow etc.
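As a rough sketch of that convention, this is how a requested asset url lines up with a file on disk. The public_file_for helper is hypothetical and only models the mapping; Sinatra does the real work itself, and the folder is configurable through its public_folder setting.

```ruby
# Assumed default name of Sinatra's static asset folder.
PUBLIC_DIR = 'public'

# Hypothetical helper: the file on disk that a given asset url refers to.
def public_file_for(url_path)
  File.join(PUBLIC_DIR, url_path)
end

puts public_file_for('/images/cat-picture.jpg')
```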

4) Exclude all the Sinatra crap that gets tangled up in Jekyll

Having Jekyll and Sinatra running from the same root folder caused a lot of things to get tangled up. Use the exclude setting in Jekyll’s _config.yml to list all the directory names that Jekyll doesn’t need to know about. (See also below for more on the exclude setting.)

5) Create a dev and production environment for Jekyll

By setting up a separate configuration file, _config_dev.yml, and directing Jekyll to use it when I was building the site locally, I made it much easier to specify different settings per environment. All I really needed was the site.url:

  url: "http://localhost:3000"

To build the site with this local configuration I used the command below. It loads all the configuration from _config.yml, then reads _config_dev.yml, and where both define the same key, the _config_dev.yml value takes precedence. I only had one thing in there.

  jekyll build --config _config.yml,_config_dev.yml
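The precedence rule is easy to picture as a hash merge. This is only a sketch of the idea: the title and production url are invented, and a plain Hash#merge stands in for whatever Jekyll does internally when it combines the files.

```ruby
require 'yaml'

# Hypothetical contents of _config.yml.
base_config = YAML.safe_load(<<~CONFIG)
  title: My blog
  url: "http://www.mysite.com"
CONFIG

# Contents of _config_dev.yml: just the one override.
dev_config = YAML.safe_load(<<~CONFIG)
  url: "http://localhost:3000"
CONFIG

# The later file wins on duplicate keys; everything else is kept.
merged = base_config.merge(dev_config)

puts merged['url']
puts merged['title']
```

The order of the files after --config matters for the same reason: whichever file comes last gets the final say.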

Deploying the lot to Heroku

Heroku is nice to use. However, a few quirks of how I set up the site made it harder to deploy than I expected. Despite seeing advice to use the PHP engine and to install various buildpacks, I ended up just using the Ruby build engine.

Critically, setting a Procfile helped a lot. Really, all the site needs to do is build the Jekyll site before running the Sinatra commands. Then when Sinatra runs on the serve command, the posts are all prebuilt as html files, and Sinatra has no need to know how they got there.

This is my simple Procfile:

  web: jekyll build && serve

More weird errors? - this could be the solution

If you’re getting strange errors from the jekyll build command, one thing that helped me was adding even more to the exclude array in Jekyll’s _config.yml file.

  exclude: ['views', 'Gemfile', 'app.rb', 'config.ru', 'README.md', 'Gemfile.lock', 'vendor', 'bin']

As the name suggests, exclude stops Jekyll trying to render stuff that is to do with Sinatra. I initially hadn’t included bin or vendor as I hadn’t created those directories myself. Adding them saved me a lot of problems.
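Here is a simplified picture of what that list does to a project’s files. The excluded? helper and the file listing are hypothetical, and Jekyll’s own matching is more involved than this first-path-segment check.

```ruby
# The exclude list from the config above.
EXCLUDE = %w[views Gemfile app.rb config.ru README.md Gemfile.lock vendor bin].freeze

# Simplified stand-in for Jekyll's matching: skip a path when its first
# segment is on the list.
def excluded?(path)
  EXCLUDE.include?(path.split('/').first)
end

# Hypothetical project listing.
paths = %w[_posts/2016-12-10-new-site.md app.rb vendor/bundle/ruby/gems.rb views/index.erb]
puts paths.reject { |path| excluded?(path) }
```

Only the post is left for Jekyll to process; the Sinatra files and the vendor directory that Heroku creates are skipped.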

See all the code
