http://ssw2014.formidablelabs.com
@ryan_roemer | formidablelabs.com
Production can be a rough place for your Node.js apps. Things can go very wrong out in the wild.
Whether on PaaS, IaaS, or bare metal.
Fail and recover at multiple levels.
Let's look at failure from a system perspective.
Have a strong bias for killing the worker.
uncaughtException, Domains
foo.on("error")
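A minimal sketch of "kill the worker" on an uncaught exception: log the error, then exit so that a supervisor can start a fresh process. The names `installCrashHandler`, `proc`, and `log` are hypothetical and only for illustration; the exit code of 1 is an assumption.

```javascript
// Sketch: log the fatal error, then exit so a supervisor can respawn the worker.
// `installCrashHandler`, `proc`, and `log` are hypothetical names for illustration.
function installCrashHandler(proc, log) {
  proc.on("uncaughtException", function (err) {
    log("Uncaught exception, killing worker: " + ((err && err.stack) || err));
    // Application state is unknown at this point; do not try to keep serving.
    proc.exit(1);
  });
}

// In a real worker:
//   installCrashHandler(process, console.error);
```

Passing the process object in as an argument keeps the handler easy to exercise with a fake in tests.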
/*global process:false */
var recluster = require("recluster");
var cluster = recluster("./server.js");
cluster.run();
// Hot reload: kill -s SIGUSR2 CLUSTER_PID
process.on("SIGUSR2", function() {
  console.log("Got SIGUSR2, reloading cluster...");
  cluster.reload();
});
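For comparison, the respawn half of cluster management can be sketched with the shape of Node's core `cluster` API (`fork()` and the `"exit"` event). This is not how recluster is implemented, only a rough illustration; `superviseWorkers` and `clusterApi` are hypothetical names.

```javascript
// Sketch: keep a fixed number of workers alive by replacing any that exit.
// `superviseWorkers` is a hypothetical helper, not part of recluster.
function superviseWorkers(clusterApi, count) {
  // Replace any worker that dies, whatever the reason.
  clusterApi.on("exit", function (worker, code, signal) {
    console.log("Worker exited (" + (signal || code) + "), forking a replacement...");
    clusterApi.fork();
  });
  // Start the initial set of workers.
  for (var i = 0; i < count; i++) {
    clusterApi.fork();
  }
}

// In a real master process:
//   var coreCluster = require("cluster");
//   if (coreCluster.isMaster) {
//     superviseWorkers(coreCluster, require("os").cpus().length);
//   }
```

A production manager would also throttle respawns so that a worker crashing on startup does not cause a tight fork loop.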
monit or alternatives
Everything up to this point should have hot failover.
Hot failover across datacenters?
Get out of the business of failover when you don't have to do it yourself.
Don't rely on system / service resources you don't need to.
Isolate failures you can't avoid.
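One common way to isolate a dependency you cannot avoid is to bound how long you wait for it. The slides do not name a technique here, so treat this as an assumption: a small callback-style timeout wrapper, where `withTimeout`, `fn`, and `ms` are hypothetical names.

```javascript
// Sketch: bound how long a call to an unreliable dependency may take.
// `withTimeout`, `fn`, and `ms` are hypothetical names for illustration.
function withTimeout(fn, ms) {
  return function (callback) {
    var done = false;
    var timer = setTimeout(function () {
      if (done) { return; }
      done = true;
      callback(new Error("Timed out after " + ms + "ms"));
    }, ms);
    fn(function (err, result) {
      if (done) { return; } // Late or duplicate reply: ignore it.
      done = true;
      clearTimeout(timer);
      callback(err, result);
    });
  };
}

// Usage, with a hypothetical `lookupUser(callback)` function:
//   withTimeout(lookupUser, 500)(function (err, user) { /* ... */ });
```

The caller gets exactly one callback, either the real result or a timeout error, so a slow dependency cannot hold a request open indefinitely.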
Look to resources you must depend on:
Node.js apps can be bad neighbors.
Data drives problem discovery and action.
Things to look for in Node.js apps...
Identify
Decide