Prudence
Scalable REST Platform
For the JVM

Deployment

"Deployment" here refers to getting your Prudence applications running so that users can, well, use them. The challenge is that the development environment is often quite different from the production, staging and testing environments, and indeed it's the point where development work must integrate with systems administration and operations.
Approaching deployment can quickly mire you in a comparison of project management ideologies: some prefer Continuous Integration (CI) and agile methods, others prefer more careful deployment by humans according to step-by-step plans. We'll bypass the ideological discussion here, and deal specifically with the technical possibilities and tools. It's up to you to decide which deployment technologies fit best with or best enable your project management ideology.

Deployment Strategies

File Synchronization

Sometimes the best strategies are the most straightforward.
Because your entire Prudence installation is contained in one directory, you can simply copy it from your development environment to your deployment environments. Even better, you can use a synchronization tool like rsync, which will efficiently copy only the updated/new files. Best of all, you can use a two-way synchronization tool like Unison, which allows on-the-fly changes you make at the deployment environment to synchronize back to your development environment. Both tools mentioned use compression and batch transfers for speed and can run over SSH for security.
Actually, the deployment origin does not have to be a single programmer's development environment: you can create a dedicated deploying environment from which to deploy to all nodes and have it shared among a team of programmers.
Example
Let's see how this is done using Unison. Because Unison synchronizes exactly two roots at a time, we'll create one profile per node in our development environment. We'll store the first in "~/.unison/production-node1.prf":
root = /path/to/prudence
root = ssh://node1.mysite.org//path/to/prudence
ignore = Path cache
ignore = Path logs
ignore = Path component/applications/stickstick/data
ignore = Path .git
ignore = Path .gitignore
The profile for the second node, "production-node2.prf", is identical except that its second root is "ssh://node2.mysite.org//path/to/prudence". Also note that you will likely want to exclude a few localized directories from syncing: in this example, we're ignoring the cache and logs, an extra data path, and also files used by our version control system, Git.
To synchronize with the above profiles, run this:
unison production-node1 -batch
unison production-node2 -batch
Of course, you can create additional profiles, for example "staging.prf" for your staging environment.

Version Control Systems (VCS)

If you're already using a VCS, why not use it to deploy your applications, too? In many ways, this is as straightforward as file synchronization, though it has a few important advantages and disadvantages of its own.
You can also adopt a hybrid strategy: use VCS to deploy the main application code, and install the other parts of it (including Prudence itself) via some other means.
Be sure to read the Sincerity tutorial, which offers a few suggestions for using a VCS with Prudence. They apply to development as well as to deployment.

Packaging

You can encapsulate your entire Prudence container, or individual applications and services, into deployable, versioned, interdependent, signed packages.
Though configuring and creating the packages is the hard part, deploying them is often very easy. Packaging's big advantage is how easy and safe it makes uninstalling, upgrading and downgrading. The strategy reveals many of its advantages when it is used modularly: it makes it possible to install/upgrade/downgrade only specific applications or services, while leaving the rest of the deployment intact. Different types of nodes could thus be installed as assemblages (meta-packages) of particular packages. Finally, careful management of dependencies can be used to ensure that the package has everything it needs to run properly.
Modularity has a huge cost in terms of project management complexity, which should not be underestimated. However, if you're already managing your project as separate modules, with their own roadmaps and version progression, it can make a lot of sense to deploy it that way, too.
There are many packaging standards and tools out there, but we'll mention a few that you are especially likely to use with Prudence.
Docker
Because everything is in one directory, and the only requirement is a JVM (actually, just a JRE), it's trivial to package your Prudence containers in Docker. See the Sincerity Manual for instructions.
Maven
Apache Maven is a comprehensive (and highly complex) project management tool for the JVM, especially targeted at the Java language and related technologies. Whether or not you use the Maven tool itself, its repository format (also known as iBiblio) has become the de facto standard for JVM packaging.
The Sincerity tool, on which Prudence is itself distributed, uses the Maven repository format, but adds a few important (and optional) features to its packaging specification, namely the ability to unpack archives into the container, and to run install/uninstall hooks for each package. We recommend Sincerity for Maven-type package deployment: it will handle not only your own packages, but also Prudence itself, as well as other Sincerity plugins and add-ons.
You can package and publish your packages using the Maven tool: the Sincerity packaging documentation includes a template configuration for Maven, which you can easily modify for your own packages (and bypass Maven's notorious learning curve). Alternatively, you can use easier tools like Gradle and Ivy.
Maintaining and managing the repository is easy enough. At its most straightforward, you can simply host the repository's filesystem via a web server. However, there are also several powerful tools and hosted solutions offering many useful features, such as proxying of other repositories. Sonatype's Nexus is especially easy to install and get running using Sincerity. Another great option is JFrog's Artifactory.
Debian and RPM
If you are deploying to nodes running a Linux-based operating system, then you are likely already using a packaging system: either Debian or RPM. Using the native packaging system for your own deployment gives you the very useful advantage of allowing for dependencies on OS packages, as well as having a single, unified packaging system for everything. At the very least, for example, you'll want your Prudence packages to depend on a JVM. By creating a meta-package for each node type, you can then install and upgrade entire nodes, starting from a freshly installed operating system, by simply installing a single package.
Debian is the more complex of the two standards: it's actually not just a file format, but part of a comprehensive, integrated operating system build system, requiring several highly specific configuration files per package. It might be easier to use more minimal tools for packaging your Prudence applications: we recommend jdeb for Debian and Redline for RPM, which both run on the JVM and can be integrated into Ant builds.
Another important advantage of using Debian or RPM is that you can integrate your Prudence packages into comprehensive infrastructure management and orchestration tools. There is a great variety among these: some are tied to specific operating systems, some are hosted, some proprietary. If you're using Ubuntu, you can use Juju and Landscape. For RedHat and CentOS, you can use YADT, which can also be used for your build process.

Load Balancing and Proxies

One of the great advantages of REST architectures is that they're trivial, in an architectural sense, to scale horizontally: any number of identical nodes can sit comfortably behind a load balancer. Because each REST request is self-contained, it doesn't matter which node handles which request.
Well, that's a bit idealized. Actually, requests are not themselves identical: some might need access to shared resources, such as databases and task farms, and in complex applications it may make sense to have different kinds of nodes answering different kinds of requests, or at the very least it may be important to route certain requests to certain nodes that would do a better job at servicing the request (for example, if they are nearer to the specific resources the request needs). Your routing needs might be quite sophisticated. See a more comprehensive discussion of the "partitioning" problem in the scaling tips article. Nevertheless, for simpler applications load balancing is indeed trivial, and ready-made products, services and algorithms will fit most use cases.

Clusters

You don't have to enable clusters in order to create a load-balanced Prudence deployment. However, you can use the cluster features to allow for powerful cooperation between nodes. In particular, they can share a Hazelcast cache backend.
Also take a look at the task farming feature: you can run a separate task farm cluster without the application nodes forming a cluster themselves.

Choices

Good load balancers do more than just scale: they allow for robustness by removing problematic nodes from the pool, either because of errors or because of poor performance. Often, you can configure the various thresholds and the behavior of the back-off algorithms.
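To make the idea of thresholds and back-off concrete, here is a minimal sketch in plain JavaScript of how a balancer might take a failing node out of the pool and let it back in later. It is an illustration only, not how any particular product works: MAX_FAILURES, BASE_BACKOFF_MS and the createNode, report and isAvailable functions are hypothetical names introduced for this example.

```javascript
// Hypothetical thresholds; real load balancers expose similar knobs under their own names
var MAX_FAILURES = 3
var BASE_BACKOFF_MS = 1000

function createNode(name) {
	return {name: name, failures: 0, retryAt: 0}
}

// Record the outcome of a request handled by a node at time 'now' (milliseconds)
function report(node, ok, now) {
	if (ok) {
		node.failures = 0
		node.retryAt = 0
		return
	}
	node.failures += 1
	if (node.failures >= MAX_FAILURES) {
		// Double the back-off for every failure beyond the threshold
		var backoff = BASE_BACKOFF_MS * Math.pow(2, node.failures - MAX_FAILURES)
		node.retryAt = now + backoff
	}
}

// A node is in the pool unless it is waiting out a back-off period
function isAvailable(node, now) {
	return now >= node.retryAt
}

var node = createNode('node1.myapp.org:8080')
report(node, false, 0)
report(node, false, 0)
var afterTwo = isAvailable(node, 0) // two failures: still in the pool
report(node, false, 0)
var afterThree = isAvailable(node, 500) // three failures: removed for 1000 ms
var later = isAvailable(node, 1000) // back-off elapsed: eligible for a retry
```

Passing the clock in as "now" keeps the sketch deterministic; a real balancer would read the system time and would usually probe the node before trusting it with user traffic again.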
If you're deploying to a hosted "cloud" environment, it could be that your host provides a load-balancing solution. It's often a good choice: these load balancers will likely perform better and be more reliable than those running inside a virtual host. However, they may not be flexible (or trustworthy) enough for your needs. It's easy enough to install your own load balancer using a wide range of products: we'll provide examples for using two popular solutions below.
Who should handle SSL? Prudence can handle SSL perfectly fine on its own. However, when using a load balancer, you may have the option of "terminating" SSL there. The problem with terminating SSL early is that you have unencrypted packets moving between the load balancer and your nodes, which is a security risk. You definitely want SSL going all the way to Prudence if you're deployed in an environment you can't trust. Otherwise, terminating early is often recommended, as it can offer better utilization of your application node resources, and allow for simpler deployments. Both examples below demonstrate how to terminate SSL at the load balancer.

Nginx

Nginx is a popular general-purpose web server, which has several high-quality modules. It's a good choice if you need other features in addition to load balancing, but also works fine as a standalone load balancer. Refer to the documentation for the proxy and upstream modules for a complete reference.
Here's a simple configuration:
http {
  server {
    listen 80;
    location / {
      proxy_pass http://prudence;
    }
  }

  server {
    listen 443 ssl spdy;
    ssl on;
    ssl_certificate_key /etc/ssl/server.key;
    ssl_certificate /etc/ssl/server.crt;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers 'ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA';
    ssl_prefer_server_ciphers on;
    ssl_session_timeout 5m;
    ssl_session_cache shared:SSL:5m;
    location / {
      proxy_pass http://secure_prudence;
    }
  }

  upstream prudence {
    server node1.myapp.org:8080;
    server node2.myapp.org:8080;
    server node3.myapp.org:8080;
  }

  upstream secure_prudence {
    server node1.myapp.org:8081;
    server node2.myapp.org:8081;
    server node3.myapp.org:8081;
  }
}
Nginx offers some useful routing features. For example, you can give nodes "weights," where a higher weight means that more requests will be sent to that node:
upstream prudence {
  server node1.myapp.org:8080 weight=1;
  server node2.myapp.org:8080 weight=2;
  server node3.myapp.org:8080 weight=4;
}
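To get a feel for how weights like these split traffic, here is a minimal sketch of a smooth weighted round-robin selector in plain JavaScript. It illustrates the general technique and is not Nginx's own code; the pickNode function and the nodes table are hypothetical names introduced for this example.

```javascript
// Hypothetical node table using the weights 1, 2 and 4
var nodes = [
	{name: 'node1.myapp.org:8080', weight: 1, current: 0},
	{name: 'node2.myapp.org:8080', weight: 2, current: 0},
	{name: 'node3.myapp.org:8080', weight: 4, current: 0}
]

// Smooth weighted round-robin: every node gains its weight per request,
// the node with the highest running total is chosen, and the chosen node
// then pays back the sum of all weights
function pickNode(nodes) {
	var total = 0
	var best = null
	for (var i = 0; i < nodes.length; i++) {
		var node = nodes[i]
		node.current += node.weight
		total += node.weight
		if ((best === null) || (node.current > best.current)) {
			best = node
		}
	}
	best.current -= total
	return best.name
}

// Count how often each node is chosen over one full cycle of 7 requests
var counts = {}
for (var r = 0; r < 7; r++) {
	var name = pickNode(nodes)
	counts[name] = (counts[name] || 0) + 1
}
// counts: node1 is chosen 1 time, node2 2 times, node3 4 times
```

Over each cycle of seven requests the nodes receive one, two and four requests respectively, interleaved rather than sent in bursts.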
Or you can enable client IP-based routing, so that requests from a particular client will always go to the same node:
upstream prudence {
  ip_hash;
  server node1.myapp.org:8080;
  server node2.myapp.org:8080;
  server node3.myapp.org:8080;
}
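The principle behind client IP-based routing can be sketched in a few lines of plain JavaScript. Nginx documents that ip_hash keys on the first three octets of an IPv4 address; the hash arithmetic below is a simple stand-in chosen for illustration, not the function Nginx uses, and nodeForClient is a hypothetical name.

```javascript
var nodes = ['node1.myapp.org:8080', 'node2.myapp.org:8080', 'node3.myapp.org:8080']

// Map a client IPv4 address to a node; the same address always maps to the same node
function nodeForClient(address, nodes) {
	// Key on the first three octets, so clients in the same /24 network share a node
	var octets = address.split('.').slice(0, 3)
	var hash = 0
	for (var i = 0; i < octets.length; i++) {
		// Simple illustrative hash (an assumption, not Nginx's algorithm)
		hash = (hash * 31 + parseInt(octets[i], 10)) % 65536
	}
	return nodes[hash % nodes.length]
}

var first = nodeForClient('203.0.113.7', nodes)
var second = nodeForClient('203.0.113.7', nodes)
var neighbor = nodeForClient('203.0.113.200', nodes)
// first, second and neighbor all refer to the same node
```

Note that with this kind of routing, adding or removing a node changes the mapping for many clients, so it should not be relied on as the only way to keep client state.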

Perlbal

Perlbal is a minimalist web server dedicated solely to load balancing. It offers very few configuration options, but is eminently hackable due to being written in crisp Perl. It is recommended for users who don't like fiddling with knobs or who appreciate single-purpose tools.
Here's an example "perlbal.conf":
CREATE POOL pool
  SET nodefile = /etc/perlbal/nodes

CREATE POOL secure_pool
  SET nodefile = /etc/perlbal/secure_nodes

# HTTP
CREATE SERVICE balancer
  SET listen          = 0.0.0.0:80
  SET role            = reverse_proxy
  SET pool            = pool
  SET verify_backend  = on

# HTTPS
CREATE SERVICE secure_balancer
  SET listen          = 0.0.0.0:443
  SET role            = reverse_proxy
  SET pool            = secure_pool
  SET verify_backend  = on
  SET enable_ssl      = on
  SET ssl_key_file    = /etc/ssl/server.key
  SET ssl_cert_file   = /etc/ssl/server.crt
  # The following is recommended to work around a bug in older versions of IE
  # (the default is ALL:!LOW:!EXP)
  SET ssl_cipher_list = ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP:+eNULL

# Internal management port
CREATE SERVICE mgmt
  SET role   = management
  SET listen = 127.0.0.1:60000

ENABLE balancer
ENABLE secure_balancer
ENABLE mgmt
The "nodes" file is a list of IP addresses (not hostnames!) with ports. We'll add three Prudence instances running at the default server port:
192.168.1.10:8080
192.168.1.11:8080
192.168.1.12:8080
The "secure_nodes" file is the same, but uses our separate server port:
192.168.1.10:8081
192.168.1.11:8081
192.168.1.12:8081
If the node files are edited, Perlbal will pick up their changes on the fly.

Web Data

The load balancer handles user requests instead of your Prudence instances. (This is sometimes called, from the client's perspective, a "reverse" proxy.) This means that request headers might be modified before they reach you, and response headers after they leave you.
Your Host
One thorny issue is that the host and port (and scheme, if you are proxying https to http) you get are different from those the client sent. For example, the client might have made a request to "https://myapp.org/myapp/" (at port 443), but it reaches your node as "http://192.168.1.10:8081/myapp/".
In terms of routing your URI-space, this is not a problem: Prudence always uses the URL in the original request. However, you might care about which server you are on, for example if you need to access local services.
Usefully, the standard HTTP/1.1 "Host" header can be used here: your load balancer will likely set it to be your server. In Prudence, you can access its parsed value via the conversation.request.hostRef API:
var hostRef = conversation.request.hostRef
var host = hostRef.hostDomain // '192.168.1.10'
var port = hostRef.hostPort // 8081
var protocol = hostRef.scheme // 'http'
The same value is used for virtual host configuration.
In some cases you might want the load balancer to work transparently, leaving the original "Host" header intact. This behavior is sometimes configurable in load balancers.
For example, in Nginx:
location / {
  proxy_pass http://prudence;
  proxy_set_header Host $host;
}
Forwarded Headers
One problem with the standard "Host" header is that it only contains the host and port, but not the scheme. If you're proxying https to http, and are setting the "Host" header to work transparently, you will nevertheless lose the original scheme.
Though the HTTP/1.1 specification does not have a solution to this problem, we can use a widely supported de facto standard, first introduced in Squid: the "X-Forwarded-Proto" header. Also, "X-Forwarded-Host" (and/or "X-Forwarded-Port") can be used instead of "Host", allowing you to retain the "Host" of the proxy without overwriting it.
You can enable support of these headers in Prudence per application by setting "app.settings.routing.useForwardedHeaders" to true in your application's settings.js:
app.settings = {
	...
	routing: {
		useForwardedHeaders: true
	}
}
By default, Prudence does not enable the interpretation of these headers, because if you're not behind a proxy, it would allow clients to manipulate the information.
Note that you can also use the ForwardedFilter directly for more fine-grained control over which requests will use these headers.
Your load balancer must also be configured to set these headers. For example, in Nginx:
location / {
  proxy_pass http://prudence;
  proxy_set_header X-Forwarded-Proto $scheme;
  proxy_set_header X-Forwarded-Port $server_port;
}
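To show what interpreting these headers amounts to, here is a minimal sketch in plain JavaScript that rebuilds the client's original base URL. It is not Prudence's implementation: originalBaseUrl is a hypothetical function, "headers" is assumed to be a plain object keyed by lower-cased header names, and "fallback" describes the request as the node actually received it.

```javascript
// Rebuild the client's original base URL from the de facto forwarding headers,
// falling back to what the node itself saw when a header is missing
function originalBaseUrl(headers, fallback) {
	var scheme = headers['x-forwarded-proto'] || fallback.scheme
	var host = headers['x-forwarded-host'] || fallback.host
	var port = parseInt(headers['x-forwarded-port'] || fallback.port, 10)
	// Omit the port when it is the default for the scheme
	var isDefault = ((scheme === 'http') && (port === 80)) || ((scheme === 'https') && (port === 443))
	return scheme + '://' + host + (isDefault ? '' : ':' + port) + '/'
}

var forwarded = originalBaseUrl(
	{'x-forwarded-proto': 'https', 'x-forwarded-host': 'myapp.org', 'x-forwarded-port': '443'},
	{scheme: 'http', host: '192.168.1.10', port: 8081}
)
// forwarded is 'https://myapp.org/'

var direct = originalBaseUrl({}, {scheme: 'http', host: '192.168.1.10', port: 8081})
// direct is 'http://192.168.1.10:8081/'
```

Because any client can send these headers, a function like this should only be trusted for requests that are known to have passed through your own proxy.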
Client IP Address
Prudence's conversation.client.upstreamAddress API identifies the request's client. However, in a load-balancing scenario, the client is actually the load balancer itself. This is tricky: there actually might be various components (load balancers, caches, gateways) along the way, so how can the original IP address be preserved?
We can use the "X-Forwarded-For" header (a de facto standard), which is a comma-separated list of client IP addresses in order. Each component along the way appends the address it received the request from before forwarding onward. The first IP address would thus be the original client.
The default server configuration in Prudence does not enable the interpretation of this header, because if you're not behind a proxy, it would allow clients to manipulate the information. To enable it, uncomment or add this line in "/component/servers/http.js":
server.context.parameters.set('useForwardedForHeader', 'true')
See the relevant Restlet documentation for details. Also note that you can use the conversation.client.forwardedAddresses API to access the complete list.
Your load balancer must also be configured to set "X-Forwarded-For". For example, in Nginx:
location / {
  proxy_pass http://prudence;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
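For illustration, this is roughly what reading the header involves, as a minimal sketch in plain JavaScript. The parseForwardedFor function is a hypothetical helper, not part of Prudence or Restlet, and the addresses are made-up examples.

```javascript
// Split an "X-Forwarded-For" value into a list of addresses, original client first
function parseForwardedFor(value) {
	var addresses = []
	var parts = (value || '').split(',')
	for (var i = 0; i < parts.length; i++) {
		// Trim the optional whitespace around each entry
		var address = parts[i].replace(/^\s+|\s+$/g, '')
		if (address.length > 0) {
			addresses.push(address)
		}
	}
	return addresses
}

var chain = parseForwardedFor('203.0.113.7, 10.0.0.5, 10.0.0.9')
var originalClient = chain[0]
// originalClient is '203.0.113.7'
```

The first entry is only as trustworthy as the first component that wrote it, which is why the header should be ignored unless you know your nodes sit behind a proxy you control.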

Adaptable Configurations

The Sincerity tool, which is used to bootstrap Prudence, gives you a lot of deployment power.
In particular, its script-based configuration means that you don't have to create separate configuration files for different deployment environments: you can use the same deployed files for development, staging, testing and production.
This depends on the scripts discovering in what environment they are running, and performing the appropriate configuration. Here are a few suggested discovery methods:
By System Configuration File
You can configure your node via a single file. For example, let's create "/etc/deployment":
DEPLOYMENT=production
In our Prudence bootstrapping scripts, we can parse the file safely like so:
document.require('/sincerity/jvm/')
var deployment = 'staging'
try {
	deployment = Sincerity.JVM.fromProperties(Sincerity.JVM.loadProperties('/etc/deployment')).DEPLOYMENT
} catch(x) {}
println('Deployment: ' + deployment)

if (deployment == 'development') {
	...
}
Note that we are defaulting to "staging" in case the file doesn't exist. Defaulting to "development" might be risky: development deployments usually reveal too much data, or otherwise provide security overrides.
The advantage of this file format is that it's very easy to parse in other languages, so you can use it to configure non-Prudence components, too. Here's an example in bash:
if [ -f /etc/deployment ]; then
	. /etc/deployment
	echo "Deployment: $DEPLOYMENT"
else
	DEPLOYMENT=staging
	echo "Deployment: staging (default)"
fi
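Once the deployment name is known, the rest of the configuration can key off it. The following is a minimal sketch in plain JavaScript, assuming the file's text has already been read: parseSettings, optionsFor and the option names (debug, minimumInstances) are hypothetical and only illustrate the pattern of falling back to the safer "staging" values.

```javascript
// Parse the text of a KEY=value file (such as "/etc/deployment") into an object
function parseSettings(text) {
	var settings = {}
	var lines = text.split('\n')
	for (var i = 0; i < lines.length; i++) {
		var line = lines[i].replace(/^\s+|\s+$/g, '')
		var equals = line.indexOf('=')
		// Skip blank lines, comments and lines without a key
		if ((line.charAt(0) === '#') || (equals < 1)) {
			continue
		}
		settings[line.substring(0, equals)] = line.substring(equals + 1)
	}
	return settings
}

// Hypothetical per-environment options
var options = {
	development: {debug: true, minimumInstances: 1},
	staging: {debug: false, minimumInstances: 1},
	production: {debug: false, minimumInstances: 4}
}

// Unknown or missing deployment names get the "staging" options
function optionsFor(text) {
	var name = parseSettings(text).DEPLOYMENT
	return options.hasOwnProperty(name) ? options[name] : options.staging
}

var production = optionsFor('DEPLOYMENT=production\n')
var fallback = optionsFor('')
// production.minimumInstances is 4; fallback is the staging options
```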
By IP Address
Your deployment environment might be known according to the local IP address or subnetwork. It's possible to look up the IP address in a table, or otherwise parse parts of it (the subnetwork).
To retrieve it, the following might work:
var address = java.net.InetAddress.localHost.hostAddress
This API won't work as expected in some local configurations, instead returning the loopback address, "127.0.0.1". Also, some nodes may have multiple IP addresses. To iterate all local IP addresses, use this code:
var addresses = []
for (var e = java.net.NetworkInterface.networkInterfaces; e.hasMoreElements(); ) {
	var networkInterface = e.nextElement()
	for (var ee = networkInterface.inetAddresses; ee.hasMoreElements(); ) {
		var address = ee.nextElement()
		if (!address.loopbackAddress) {
			addresses.push(address.hostAddress)
		}
	}
}
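With a list of local addresses in hand, the table lookup can be sketched as follows in plain JavaScript. This handles IPv4 only, and the subnet table, the deployment names assigned to each subnet, and the toNumber, inSubnet and deploymentFor functions are all hypothetical examples.

```javascript
// Convert a dotted IPv4 address to an unsigned 32-bit number
function toNumber(address) {
	var octets = address.split('.')
	var number = 0
	for (var i = 0; i < 4; i++) {
		number = (number * 256) + parseInt(octets[i], 10)
	}
	return number
}

// Check whether an IPv4 address falls inside a subnet given in CIDR notation
function inSubnet(address, cidr) {
	var parts = cidr.split('/')
	var size = Math.pow(2, 32 - parseInt(parts[1], 10))
	// Addresses in the same subnet share the same block index
	return Math.floor(toNumber(address) / size) === Math.floor(toNumber(parts[0]) / size)
}

// Hypothetical subnet table, checked in order
var subnets = [
	{cidr: '10.1.0.0/16', deployment: 'production'},
	{cidr: '10.2.0.0/16', deployment: 'staging'},
	{cidr: '192.168.0.0/16', deployment: 'development'}
]

function deploymentFor(addresses) {
	for (var a = 0; a < addresses.length; a++) {
		for (var s = 0; s < subnets.length; s++) {
			if (inSubnet(addresses[a], subnets[s].cidr)) {
				return subnets[s].deployment
			}
		}
	}
	// Unknown networks fall back to the safer default
	return 'staging'
}

var deployment = deploymentFor(['10.1.4.20'])
// deployment is 'production'
```

Addresses that are not dotted IPv4 (for example IPv6) never match a subnet here, so they simply fall through to the default.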
By "Cloud" API
If you're running in a "cloud" environment, your host likely provides you with an API service to discover your node's name, the group it belongs to, etc., making it easy to determine your deployment environment.
Many cloud environments support the OpenStack API, which you can access directly via RESTful requests or through a dedicated JVM wrapper. Specifically, its identity service can be used to retrieve information about the current node.
For example, let's list the tenants of our node:
document.require('/prudence/resources/')
var openstackBaseUri = ...
var tenants = Prudence.Resources.request({
	uri: openstackBaseUri + 'v2.0/tenants',
	mediaType: 'application/json'
})
for (var t in tenants) {
	var tenant = tenants[t]
	println(tenant.name)
}
Note that the above code will only work from a running Prudence application: during bootstrapping, you will have to use a different method to make REST requests, for example the dedicated JVM wrapper.

Operating System Service (Daemon)

In production environments it's recommended to have Prudence installed as an operating system service ("daemon"), which allows you to use standard tools to start, stop and monitor its status. It will also guarantee that Prudence starts as quickly as possible when the system starts, reducing downtime in the case of restarts.
In Prudence this is best handled by installing the Sincerity service plugin, which adds a powerful native "wrapper" service around the JVM. The documentation there also provides you with examples of how to install it in your operating system.
Note that the wrapper uses its own logging system, separate from the one used by Prudence, though the default configurations of both will work well together.

Security

Securing your deployment is important whether you're running an Internet site or a local intranet site. Actually, "security" is an umbrella term for various aspects dealing with, ahem, unintended use of your site.
It's crucial that you study this topic well or hire an expert to handle it for you. It's astounding how often big, "trusted" companies fail miserably in securing their applications. Don't wait until the fire starts to put it out: prevent it from ever happening.
Unfortunately, it's not a simple problem field: attacks are getting more and more sophisticated, using sheer computational prowess to break through encryption, "rainbow" attacks to uncover user passwords, and finding clever loopholes for injecting code into your application or database runtime. We definitely can't cover all aspects of security here, especially because it's a moving target. But we'll cover some essentials that are directly related to Prudence and the deployment technologies mentioned in this chapter.
Is Prudence less secure, as an open source product, than proprietary alternatives? To be honest, there's something scary about revealing all your cards to the hackers. At the same time, these cards are also seen by the community of users, like you, who have an interest in plugging security holes as soon as they're discovered. With closed-source software, ignorance is bliss: you have to rely on indirect knowledge to evaluate just how secure it is. And if a loophole is discovered, you have to rely on others to fix it. With Prudence, you don't have to wait for us: we've gone to great lengths not only to share the code, but also to make it as easy as possible for you to build your own patched-up Prudence. When all the factors are considered, we believe that open source is the better choice in the long run for those who care about security.

Locked-Down User

When applications are hacked, your only line of defense is the operating system. And let's put it plainly: all applications have exploits. Thus, especially because it's so easy, there's no excuse not to implement basic operating system security.
The first thing you should do is create a special user that will spawn and thus own the Prudence process. If you're running Prudence as a service, then you should install that service via that user.
Then, lock that user down. All operating systems let you control file access per user, so make sure the user can access only the files it needs. Most operating systems are too promiscuous by default, so make sure that nothing important is readable by your designated user.
But accessing the files is only the tip of the iceberg: you want Mandatory Access Control (MAC), too, to limit the user's ability to execute processes it shouldn't. If you're deploying on Linux, consider using AppArmor, which allows for simple profiles to configure the use of Linux's many security features.

Firewall

Whether you're running a cluster or a single server, there's no reason unused ports should be open to the world, and to mischief. Get yourself a firewall.
If you're deploying on Linux, iptables (a netfilter module) is the standard solution. However, it can be quite daunting to manage directly, so consider using a higher-level tool instead: we recommend the Uncomplicated Firewall (UFW), which allows for per-application profiles, and comes with sensible iptables defaults.
Which ports should you leave open?
HTTP and HTTPS
The standard and default Internet ports for these are 80 and 443 respectively. Of course, you are free to use other ports for your deployment, for which there might be security advantages.
Cluster
Are you running your nodes as a cluster? The default port used by Hazelcast is 5701, though it's easy to change it.
Cache Backends
Do you need access to a shared cache backend?

HTTPS

Enabling TLS/SSL encryption for HTTP is usually more about protecting your users' data than protecting your own. And indeed, this service you provide to your users is costly, both in the CPU cycles it requires and in the fees you must pay certification authorities for generally trusted certificates.
HTTPS also has an entirely different use case, specifically when it's used with privately issued certificates: you gain a powerful authentication barrier. Only users who own the certificate would be able to access the protected resources.
You have the option of handling SSL directly in Prudence or "terminating" at the load balancer.

HTTP Authentication

Prudence supports HTTP basic authentication, which is in turn supported by most web browsers and other HTTP clients. Coupled with HTTPS (in order to ensure that the password is transferred securely to you), it's actually not such a bad way to secure your resources. Web browsers will cache user credentials for the duration of a "session," which usually means until the web browser is closed.
HTTP authentication is secure enough when coupled with HTTPS, but it's not a scalable way to support many users and sessions. If you need a comprehensive solution, consider Diligence's authentication service.
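To see why basic authentication depends on HTTPS, consider what actually travels in the "Authorization" header. The sketch below is plain JavaScript and assumes a Node.js runtime for its Buffer class (it is not meant to run inside Prudence); basicAuthorization and decodeBasicAuthorization are hypothetical helper names, and the credentials are made up.

```javascript
// Build the "Authorization" header value a client sends for basic authentication
function basicAuthorization(username, password) {
	return 'Basic ' + Buffer.from(username + ':' + password, 'utf8').toString('base64')
}

// Recover the credentials from the header value, as any eavesdropper could
function decodeBasicAuthorization(value) {
	var decoded = Buffer.from(value.substring('Basic '.length), 'base64').toString('utf8')
	var colon = decoded.indexOf(':')
	return {username: decoded.substring(0, colon), password: decoded.substring(colon + 1)}
}

var header = basicAuthorization('admin', 's3cret')
var credentials = decodeBasicAuthorization(header)
// credentials.username is 'admin' and credentials.password is 's3cret'
```

Base64 is an encoding, not encryption: without HTTPS, anyone who can observe the request can read the password.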

Quarantine the Admins

Many sites allow for "administrative" logins that have special privileges and abilities. Probably, these abilities would be utterly destructive in the wrong hands.
It thus makes a lot of sense to quarantine these users and their needs into a separate security domain: it should be harder for an admin to log in than for a regular user. All defenses should be up: HTTPS with private certificates, a separate cluster firewalled from the regular application nodes, and sharing only the resources that are absolutely necessary for admin functionality.

The Prudence Manual is provided for you under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. The complete manual is available for download as a PDF.
