Modular Sparkleformation cfn-init Configsets

Originally published for AWS Advent 2016

This post lays out a modular, programmatic pattern for using Cloudformation Configsets in Sparkleformation codebases. This technique may be beneficial to:

  • Current Sparkleformation users looking to streamline EC2 instance provisioning
  • Current Cloudformation users looking to manage code instead of JSON/YAML files
  • Other AWS users needing an Infrastructure as Code solution for EC2 instance provisioning

Configsets are a Cloudformation-specific EC2 feature that lets you configure a set of instructions for cfn-init to run upon instance creation. Configsets group collections of specialized resources, providing a simple solution for basic system setup and configuration. An instance can use one or many Configsets, which are executed in a predictable order.

Because cfn-init is triggered on the instance itself, it is an excellent solution for Autoscaling Group instance provisioning, a scenario where external provisioners cannot easily discover underlying instances, or respond to scaling events.

Sparkleformation is a powerful Ruby library for composing Cloudformation templates, as well as orchestration templates for other cloud providers.

The Pattern

Many Cloudformation examples include a set of cfn-init instructions in the instance Metadata using the config key. This is an effective way to configure instances for a single template, but in an infrastructure codebase, doing this for each service template is repetitious and introduces the potential for divergent approaches to the same problem in different templates. When cfn-init is invoked without naming a Configset, it automatically attempts to run the one named default. Configsets in Cloudformation templates are represented as arrays. This pattern leverages Ruby’s concat method to construct the default Configset during Sparkleformation’s compilation step, which allows us to use Configsets to manage the instance Metadata in a modular fashion.
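To make the two approaches concrete, here is a minimal plain-Ruby sketch (no Sparkleformation required) of the two metadata shapes described above, built as ordinary hashes and printed as JSON. The config name install_packages and the package some_package are hypothetical placeholders, and the output only approximates what a compiled template would contain.

```ruby
require 'json'

# Shape 1: all instructions live under the single `config` key.
def single_config_metadata
  {
    'AWS::CloudFormation::Init' => {
      'config' => {
        'packages' => { 'apt' => { 'some_package' => [] } }
      }
    }
  }
end

# Shape 2: named config blocks, plus a `default` Configset listing the
# blocks cfn-init should run, in order.
def configset_metadata
  init = { 'configSets' => { 'default' => [] } }
  # Hypothetical config block name, used only for illustration.
  init['install_packages'] = {
    'packages' => { 'apt' => { 'some_package' => [] } }
  }
  # Array#concat appends in place, so later additions keep extending
  # the same `default` array.
  init['configSets']['default'].concat(['install_packages'])
  { 'AWS::CloudFormation::Init' => init }
end

puts JSON.pretty_generate(single_config_metadata)
puts JSON.pretty_generate(configset_metadata)
```

The second shape is the one the rest of this post builds on: each additional module adds its own named block and appends that name to the default array.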

To start, any Instance or Launch Config resource should include an empty array as the default Configset in its metadata, like so:

sparkleformation/templates/example_instance.rb:
resources do
  example_ec2_instance do
    type 'AWS::EC2::Instance'
    metadata('AWS::CloudFormation::Init') do
      _camel_keys_set(:auto_disable)
      configSets do
        default [ ]
      end
    end
    properties do
      ...
    end
  end
end

Additionally, the Instance or Launch Config UserData should run the cfn-init command. A best practice is to place this in a SparkleFormation registry entry. A barebones example:

sparkleformation/registry/cfn_init_user_data.rb:
SfnRegistry.register(:cfn_init_user_data) do
  user_data(
    base64!(
      join!(
        "#!/bin/bash\n",
        "apt-get update\n",
        "apt-get -y install python-setuptools\n",
        "easy_install https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-latest.tar.gz\n",
        '/usr/local/bin/cfn-init -v --region ',
        region!,
        ' -s ',
        stack_name!,
        " -r ExampleEc2Instance --role ",
        ref!(:cfn_role),
        "\n"
      )
    )
  )
end
---
sparkleformation/templates/example_instance.rb:
resources do
  example_ec2_instance do
    type 'AWS::EC2::Instance'
    metadata('AWS::CloudFormation::Init') do
      _camel_keys_set(:auto_disable)
      configSets do
        default [ ]
      end
    end
    properties do
      ...
      user_data registry!(:cfn_init_user_data)
    end
  end
end
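To see what the instance actually runs, the following plain-Ruby sketch joins the same string fragments used in the registry entry. The region, stack name, and role values are hypothetical stand-ins for what Cloudformation resolves from region!, stack_name!, and ref!(:cfn_role) at stack creation time, and the function name render_user_data is my own.

```ruby
# Plain-Ruby approximation of the script the UserData registry entry
# produces once CloudFormation has resolved its intrinsic functions.
def render_user_data(region:, stack_name:, cfn_role:)
  [
    "#!/bin/bash\n",
    "apt-get update\n",
    "apt-get -y install python-setuptools\n",
    "easy_install https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-latest.tar.gz\n",
    '/usr/local/bin/cfn-init -v --region ',
    region,
    ' -s ',
    stack_name,
    ' -r ExampleEc2Instance --role ',
    cfn_role,
    "\n"
  ].join
end

# Hypothetical values, for illustration only.
puts render_user_data(
  region: 'us-east-1',
  stack_name: 'example-stack',
  cfn_role: 'example-cfn-role'
)
```

Note that the cfn-init call passes no Configset name, which is why the default Configset is the one that runs.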

With the above code, cfn-init will run the empty default Configset. Using modular registry entries, we can expand this Configset to meet our needs. Each registry file should add the defined configuration to the default Configset, like this:

sparkleformation/registry/example_config_a.rb:
SfnRegistry.register(:example_config_a) do
  metadata('AWS::CloudFormation::Init') do
    _camel_keys_set(:auto_disable)
    configSets do |sets|
      sets.default.concat(['example_config_a'])
    end
    example_config_a do
      packages do
        apt do
          some_package '1.0.0'
        end
      end
      files('/path/to/file') do
        content 'Content string to write to file.'
      end
    end
  end
end

A registry entry can also include more than one config block:

sparkleformation/registry/example_config_b.rb:
SfnRegistry.register(:example_config_b) do
  metadata('AWS::CloudFormation::Init') do
    _camel_keys_set(:auto_disable)
    configSets do |sets|
      sets.default.concat(['example_config_b_1', 'example_config_b_2'])
    end
    example_config_b_1 do
      packages do
        yum do
          another_package '1.0.0'
        end
      end
      files('/path/to/json_config') do
        content do
          foo do
            bar 'baz'
          end
        end
      end
    end
    example_config_b_2 do
      # cfn-init reads user definitions from the users key
      users do
        system_user do
          groups ['group_w']
          uid 500
          homeDir '/opt/system_user'
        end
      end
    end
  end
end

Calling these registry entries in the template will add them to the default Configset in the order they are called:

sparkleformation/templates/example_instance.rb:
resources do
  example_ec2_instance do
    type 'AWS::EC2::Instance'
    metadata('AWS::CloudFormation::Init') do
      _camel_keys_set(:auto_disable)
      configSets do
        default [ ]
      end
    end
    registry!(:example_config_b)
    registry!(:example_config_a)
    properties do
      ...
      user_data registry!(:cfn_init_user_data)
    end
  end
end
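The ordering behavior can be illustrated without Sparkleformation. The sketch below uses a tiny stand-in for the registry (the REGISTRY hash and the register, apply_registry, and build_init helpers are all hypothetical, not part of any library) to show that the default Configset ends up in the order the entries are called.

```ruby
# Minimal stand-in for a registry: named blocks that mutate the
# AWS::CloudFormation::Init hash they are given.
REGISTRY = {}

def register(name, &block)
  REGISTRY[name] = block
end

def apply_registry(init, name)
  REGISTRY.fetch(name).call(init)
  init
end

register(:example_config_a) do |init|
  init['configSets']['default'].concat(['example_config_a'])
  init['example_config_a'] = {} # config body omitted for brevity
end

register(:example_config_b) do |init|
  init['configSets']['default'].concat(%w[example_config_b_1 example_config_b_2])
  init['example_config_b_1'] = {}
  init['example_config_b_2'] = {}
end

# Start from the empty default Configset, then apply entries in order.
def build_init(*entries)
  init = { 'configSets' => { 'default' => [] } }
  entries.each { |name| apply_registry(init, name) }
  init
end

p build_init(:example_config_b, :example_config_a)['configSets']['default']
# => ["example_config_b_1", "example_config_b_2", "example_config_a"]
```

Swapping the two arguments to build_init puts example_config_a first, which mirrors reordering the registry! calls in the template.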

Note that other approaches to extending the array will also work, for example:

sets.default += [ 'key_to_add' ]
sets.default.push('key_to_add')
sets.default << 'key_to_add'
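In plain Ruby these operators all produce the same contents, with one difference that can matter: concat, push, and << modify the existing array, while += builds a new array and rebinds the name. The sketch below demonstrates this on an ordinary local variable; how the assignment form is handled inside a Sparkleformation block is not shown here, and the method name extend_default is just for illustration.

```ruby
# Returns [rebound, original] to show which operations mutate in place.
def extend_default
  default = []
  original = default       # second reference to the same array

  default.concat(['a'])    # in place
  default.push('b')        # in place
  default << 'c'           # in place
  default += ['d']         # new array; `original` is not updated

  [default, original]
end

rebound, original = extend_default
p rebound   # => ["a", "b", "c", "d"]
p original  # => ["a", "b", "c"]
```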

Use Cases

Extending the default Configset rather than setting the config key directly makes it easy to build out cfn-init instructions in a flexible, modular fashion. Modular Configsets, in turn, create opportunities for better Infrastructure as Code workflows. Some examples:

Development Instances

This cfn-init pattern is not a substitute for full-fledged configuration management solutions (Chef, Puppet, Ansible, Salt, etc.), but for experimental or development instances cfn-init can provide just enough configuration management without the increased overhead or complexity of a full CM tool.

I use the Chef users cookbook to manage users across my AWS infrastructure. Consequently, I very rarely make use of AWS EC2 keypairs, but I do need a solution to access an instance without Chef. My preferred solution is to use cfn-init to fetch my public keys from Github and add them to the default ubuntu (or ec2-user) user. The registry for this:

SfnRegistry.register(:github_ssh_user) do
  metadata('AWS::CloudFormation::Init') do
    _camel_keys_set(:auto_disable)
    configSets do |sets|
      sets.default += ['install_curl', 'github_ssh_user']
    end
    install_curl do
      packages do
        apt do
          curl ''
        end
      end
    end
    github_ssh_user do
      commands('set_ssh_keys') do
        command join!(
          'sudo mkdir -p /home/ubuntu/.ssh && sudo curl https://github.com/',
          ref!(:github_user),
          '.keys >> /home/ubuntu/.ssh/authorized_keys'
        )
      end
    end
  end
end

In the template, I just set a github_user parameter and include the registry, and I get access to an instance in any region without needing to do any key setup or configuration management.

parameters(:github_user) do
  type 'String'
  default ENV['USER']
end
resources do
  example_ec2_instance do
    type 'AWS::EC2::Instance'
    metadata('AWS::CloudFormation::Init') do
      _camel_keys_set(:auto_disable)
      configSets do
        default [ ]
      end
    end
    registry!(:github_ssh_user)
    ...
  end
end

This could also be paired with a configuration management registry entry, with the Github user setup limited to development:

resources do
  example_ec2_instance do
    type 'AWS::EC2::Instance'
    metadata('AWS::CloudFormation::Init') do
      _camel_keys_set(:auto_disable)
      configSets do
        default [ ]
      end
    end
    if ENV['development'] == 'true'
      registry!(:github_ssh_user)
    else
      registry!(:configuration_management)
    end
    ...
  end
end

Compiling this with the environment variable development=true will include the Github Configset; in any other case the template will include the configuration management registry entry instead.
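One detail worth noting when toggling on an environment variable: the values are strings, and every string (including 'false' and the empty string) is truthy in Ruby. A bare truthiness check would therefore treat development=false as enabled. The sketch below isolates the selection logic in a hypothetical helper, provisioning_entries, that compares against the literal string 'true'.

```ruby
# Decide which registry entries to include based on an environment
# lookup. `env` defaults to ENV but accepts any hash-like object, which
# makes the logic easy to exercise without touching the real environment.
def provisioning_entries(env = ENV)
  if env['development'] == 'true'
    [:github_ssh_user]
  else
    [:configuration_management]
  end
end

p provisioning_entries('development' => 'true')   # => [:github_ssh_user]
p provisioning_entries('development' => 'false')  # => [:configuration_management]
p provisioning_entries({})                        # => [:configuration_management]
```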

In addition to being a handy shortcut, this approach is useful for on-boarding other users and teams to an Infrastructure codebase and workflow. Even with no additional automation in place, it encourages system provisioning using a code-based workflow, and provides a foundation on which to layer additional automation.

Incremental Automation Adoption

Extending the development example, a modular Configset pattern is helpful for incrementally introducing automation. Attempting to introduce automation and configuration management to an infrastructure that is actively being architected can be very frustrating: each new component requires not just understanding the component and its initial configuration, but also determining how best to automate and abstract that into code. This can lead to expedient, compromise implementations that add to technical debt, as they aren’t flexible enough to support emergent needs.

An incremental approach can mitigate these issues, while maintaining a focus on code and automation. Well understood components are fully automated, while some emergent features are initially implemented with a mixture of automation and manual experimentation. For example, an engineer approaching a new service might perform some baseline user setup and package installation via an infrastructure codebase, but configure the service manually while determining the ideal configuration. Once that configuration matures, the automation resources necessary to achieve it are included in the codebase.

Cloudformation Configsets are effective options for package installation and are also good for fetching private assets from S3 buckets. An engineer might use a Configset to set up her user on a development instance, along with the baseline package dependencies and a tarball of private assets. By working with the infrastructure codebase from the outset, she has the advantage of knowing that any related AWS components are provisioned and configured as they would be in a production environment, so she can iterate directly on service configuration. As the service matures, the Configset instructions that handled user and package installation may be replaced by more sophisticated configuration management tooling, but this is a simple one-line change in the template.

Organization Wide Defaults

In organizations where multiple engineers or teams contribute discrete application components in the same infrastructure, adopting standard approaches across the organization is very helpful. Standardization often hinges on common libraries that are easy to include across a variety of contexts. The default Configset pattern makes it easy to share registry entries across an organization, whether in a shared repository or internally published gems. Once an organizational pattern is codified in a registry entry, including it is a single line in the template.

This is especially useful in organizations where certain infrastructure-wide responsibilities are owned by a subset of engineers (e.g. Security or SRE teams). These groups can publish a gem (SparklePack) containing a universal configuration covering their concerns that the wider group of engineers can include by default, essentially offering these in an Infrastructure as a Service model. Monitoring, Security, and Service Discovery are all good examples of the type of universal concerns that can be solved this way.

Conclusion

cfn-init Configsets can be a powerful tool for Infrastructure as Code workflows, especially when used in a modular, programmatic approach. The default Configset pattern in Sparkleformation provides an easy-to-implement, consistent approach to managing Configsets across an organization, either within a single codebase or vendored in as gems/SparklePacks. Teams looking to increase the flexibility of their AWS instance provisioning should consider this pattern, and a programmatic tool such as SparkleFormation.

For working examples, please check out this repo.