Intro to DRb

  • What is DRb
  • Components of a DRb Application
  • Simple Example - What Am I Listening To?
    • Server
    • Client
  • Concurrency
  • Security

What is DRb

So, what is this DRb thing and why should you be interested? DRb literally stands for "Distributed Ruby". It is a library that allows you to send messages to, and receive responses from, remote Ruby objects via TCP/IP. Sounds kind of like RPC, CORBA or Java's RMI? Probably so. This is Ruby's simple-as-dirt answer to all of the above.

Why might you want to invoke remote code? Lots of reasons, potentially. One of the easiest to imagine is that a site might want to offer an externally available service without actually providing any code. DRb lets you do just that. A service provider could publish a Ruby object, along with a simple interface definition, and clients could connect and send predefined messages. A good example might be an online auction site opening its interfaces to automation from external programs. This would allow clients to define their own business logic for how to transact with an auction without having to deploy any auction site code to the client or client code to the server.

Or, as another example, imagine you have some heavy processing work you need done. Perhaps you have a few extra PCs (maybe a million or two), and you want to do something like, say, search for intelligent alien life. Distributed Ruby would let multiple cooperating clients interact with a central server that is responsible for managing the checkin/checkout of data sets to analyze.

Components of a DRb Application

As you might guess, there are two basic components to any DRb application: some kind of client and some kind of server. With each of these two roles comes some responsibility. Here is a quick breakdown:


Server

  • Start a TCP server socket and listen on a port.
  • Bind an object to the DRb server instance.
  • Accept connections from clients and respond to the messages they send.
  • Optionally provide access control services.

Client

  • Establish a connection to a DRb server process.
  • Bind a local object to the remote DRb object.
  • Send messages to the server object and receive its responses.

As you can see, the concepts of DRb are very intuitive to anyone who is familiar with the fundamentals of how the internet works.

Simple Example - What Am I Listening To?

What follows is a very simple (and obviously contrived) example of how to use DRb for a real application.

The Object: Provide a way to dynamically "update" my website with which song I'm currently listening to. (I told you it was contrived)

I want a way to dynamically (and in real-time) let viewers of my website find out which song I have playing on my mp3 player at home. As mentioned above, in planning the architecture for this application, there are two major components that I need to be concerned with: client and server. Just for fun, I'm going to do this a bit backward from what intuition might dictate. We'll start with the server (my laptop):

Server

Yes, that's right. My laptop is the server in this example (instead of my web server being the server). I chose this configuration, because I want the web server (client) to actually ask my home computer what song is playing at any given moment in time. With this design, I can change the underlying algorithm (which for now, will be a simple read from a text file) to something more reliable. The client processes won't need to know the details. They'll simply ask for an answer.

As I mentioned above, the responsibilities of the server are to start a server socket, bind an object to that socket, respond to messages sent to the socket, and provide access control. I'll go through these responsibilities one at a time.

Fortunately, DRb takes care of the first and second responsibilities seamlessly and painlessly. All that needs to be done is to call the DRb module method DRb.start_service, passing in a druby URI and an object to bind to it:

DRb.start_service("druby://:7777", SongNameServer.new("/tmp/songname"))


The URI in this case is the piece that specifies which port to start listening on. In our example, it is hard-coded to port 7777. This parameter can optionally accept nil, in which case it will dynamically find an available port to listen on. The port dynamically chosen can be obtained via DRb.uri, which will return the full druby URI (e.g. druby://hal9k:2001).
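As a rough sketch of the dynamic-port option just described, the snippet below passes nil as the URI and then asks DRb.uri which address was actually bound. The Greeter class is a hypothetical stand-in that is not part of the example application; any Ruby object could be served in its place.

```ruby
require 'drb'

# Hypothetical stand-in for the object being served; any Ruby object works.
class Greeter
  def hello
    "hello"
  end
end

# Passing nil as the URI asks DRb to pick an available port on its own.
DRb.start_service(nil, Greeter.new)

# DRb.uri reports the full druby URI that was actually bound.
uri = DRb.uri
puts uri

# Shut the service down again so this snippet does not keep running.
DRb.stop_service
```

The host name and port printed will differ from machine to machine and from run to run, so a client needs some way to learn the URI (for example, by reading the server's output).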

The second argument to start_service is an instance of the object which we want to be remotely messagable. In this case, I have created a class called SongNameServer, which will respond to the message songname.

class SongNameServer
  def initialize(str)
    @filename = str
  end

  # Return the first line of the file, closing the file afterwards.
  def songname
    File.open(@filename) { |f| f.readline }
  end
end

The final step is to join on the DRb server thread:

DRb.thread.join

This step keeps the process alive until the DRb server thread finishes, so the server does not exit right after starting. For more information on Threads, see the library reference on Ruby Central.

Client

Now that we've gotten the server functioning and provided an interface to our SongNameServer, it's time to write a client. What we know so far is that there is a server on host hal9k, port 7777 (the port we hard-coded when starting the service), and that there is an object bound to that server that accepts the message songname.

Again referencing the responsibility list from above, we know that we must first connect to the DRb server process via a socket and bind a local object to the remote DRb object (the SongNameServer instance). Here's how we do it:

DRb.start_service
ro = DRbObject.new(nil, 'druby://hal9k:7777')

As before, DRb handles the socket connectivity for you. All that is necessary to create a local proxy to the remotely listening Ruby object is to start the DRb service (which performs some DRb module initialization) and then instantiate a new DRbObject, passing in the URI of the remote DRb process as its second parameter.

The end result of the above code is that we now have a live object that will accept the same messages that our remote object accepts (songname, for example) and will actually proxy messages and responses between our client and the remote ruby object(s). We are now "done" with the DRb part of the code. We can send the songname message to our new object and (hopefully) receive a song name (as a String) in response:

puts ro.songname   # prints: Also Sprach Zarathustra, Op. 30: Sunrise

...and we're there!
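To try the whole exchange without two machines, here is a minimal sketch that runs the server and the client in a single Ruby process. Several details are assumptions made for the demo rather than part of the application above: a temporary file stands in for /tmp/songname, the service binds to localhost with port 0 so the operating system picks a free port, and songname chomps the trailing newline.

```ruby
require 'drb'
require 'tempfile'

# Same idea as the SongNameServer above: read the first line of a file.
class SongNameServer
  def initialize(filename)
    @filename = filename
  end

  def songname
    File.open(@filename) { |f| f.readline.chomp }
  end
end

# Assumption: a temporary file stands in for /tmp/songname.
songfile = Tempfile.new('songname')
songfile.puts "Also Sprach Zarathustra, Op. 30: Sunrise"
songfile.flush

# Server side: port 0 lets the operating system choose a free port,
# and DRb.uri then holds the URI that was really bound.
DRb.start_service('druby://localhost:0', SongNameServer.new(songfile.path))

# Client side: build a proxy for the served object and send it a message.
ro = DRbObject.new(nil, DRb.uri)
current_song = ro.songname
puts current_song

# Clean up the service and the temporary file.
DRb.stop_service
songfile.close!
```

Because both ends live in the same process here, DRb can answer the call locally instead of going over the socket. In a real deployment the client would run elsewhere and use the server's published URI.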

Code listings for client and server.

Concurrency

Until now, we've lived in our protected little world where there is one server and one client and everyone's always happy. In the harsh light of reality, things are a little less automatic. When we pass an instance of an object into the DRb service to start the server, we are truly passing a single copy of the object in to answer all messages. Because of this, it's possible to get unpredictable results if we are not careful. Take the following scenario, split into two concurrent sessions (A and B):

  • Step 1, session A: Client A connects.
  • Step 2, session A: Client A inserts context-specific member data into the DRb object as a message argument.
  • Step 2, session B: Client B connects.
  • Step 3, session A: Server executes the message, calling private methods which rely on this member variable.
  • Step 3, session B: Client B inserts context-specific member data into the DRb object as a message argument.
  • Step 4, session A: Server returns a response to Client A.
  • Step 4, session B: Server executes the message, calling private methods which rely on this member variable.
  • Step 5, session B: Server returns a response to Client B.


Now, imagine the above scenario describes an online calculator. Given a number, it will run complex calculations and return a result. Perhaps, these calculations take several seconds to complete. What would happen if two clients connected and ran the "calc" method on the following code within milliseconds of each other?

class FancyCalculator

  def calc(inval)
    @inval = inval               # shared by every client of this object
    intermediary = first_calc    # may take several seconds...uses @inval
    nextval = second_calc        # see above
    return nextval
  end

  def first_calc
    # do some stuff with @inval
    # ...
  end

  def second_calc
    # do some stuff with @inval
    # ...
  end
end


DRb will let you do this. Each client is served in its own thread, so the second call can overwrite @inval while the first is still calculating, and Client A may get back an answer computed from Client B's input. How can we avoid this kind of contention in our distributed code? We need to make the code thread safe. There are two obvious ways to do that in this (again, contrived) example. The first, and most sensible, way is to remove the dependence on member variables and just pass parameters and return values to and from the various methods in the class. The nature of the example lends itself to this solution. The second way, which would be useful in a more complex example, is to use Ruby's Mutex class to synchronize access to the member variable. Here's another contrived but illustrative example:


class FancyCalculator

  def initialize
    @mutex = Mutex.new
  end

  def calc(inval)
    # Only one thread at a time may run this block.
    @mutex.synchronize do
      @inval = inval
      intermediary = first_calc
      nextval = second_calc
      return nextval
    end
  end

  # ...
end


The hidden cost here is that clients will have to stand in line to execute the "calc" method. In more complex code, there will be more cases where this approach makes sense.
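For comparison, the first approach mentioned above (dropping the member variable and passing values between methods) might look roughly like the sketch below. The class name StatelessCalculator and the arithmetic inside first_calc and second_calc are placeholders invented for illustration, and plain threads stand in for concurrent DRb clients.

```ruby
# A sketch of the first approach: no shared instance variables, so
# concurrent callers cannot overwrite each other's input.
class StatelessCalculator
  def calc(inval)
    intermediary = first_calc(inval)
    second_calc(intermediary)
  end

  private

  # Placeholder arithmetic (hypothetical); the sleep imitates slow work.
  def first_calc(value)
    sleep 0.05
    value * 2
  end

  def second_calc(value)
    value + 1
  end
end

calculator = StatelessCalculator.new

# Simulate several clients calling the same object at once.
threads = [1, 2, 3].map do |n|
  Thread.new { calculator.calc(n) }
end

# Thread#value waits for each thread and returns its result.
results = threads.map(&:value)
p results
```

Since every call works only on its own arguments and local variables, no lock is needed and callers do not have to wait for one another.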

Security

Executable, internet-exposed code is bound to draw a little attention from security-conscious readers. Our examples so far have described globally accessible features with no regard for who or what has the right to execute them. Obviously, we don't want to deploy real-world applications in this state.

DRb deals with the security issue via its ACL (access control list) classes. Each DRb instance can have an ACL associated with it which can allow or deny hosts based on their IP addresses. While not a very flexible model, it can be used to keep the bad guys out. Here's an example (shamelessly stolen from the samples included with DRb):

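require 'drb'
require 'drb/acl'

if __FILE__ == $0
  # 209.61.159.* is chadfowler.com. The note lives out here because
  # %w() would treat a trailing "#" comment as more list entries.
  acl = ACL.new(%w(deny all
                   allow 192.168.1.*
                   allow 209.61.159.*
                   allow localhost))

  DRb.install_acl(acl)

  DRb.start_service(nil, SongNameServer.new("/tmp/songname"))
  puts DRb.uri
  DRb.thread.join
end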


As you can see, an access list is created as an object of type "ACL", which takes a list of alternating keywords (allow or deny) and host patterns describing which hosts to allow and/or deny. You can optionally supply a second argument which will change the default access level (deny or allow). This is expressed as a 0 for DENY_ALLOW or a 1 for ALLOW_DENY. The default is DENY_ALLOW.
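An access list can also be checked on its own, without starting a server. The sketch below uses ACL#allow_addr?, which takes an address array in the shape a socket's peeraddr returns. The two addresses are made up for illustration, and the empty lists are only there to show what each order falls back to when no rule matches.

```ruby
require 'drb/acl'

# Address arrays in the shape returned by a socket's peeraddr:
# [family, port, hostname, ip]. These addresses are invented for the demo.
inside  = ['AF_INET', 0, '192.168.1.5', '192.168.1.5']
outside = ['AF_INET', 0, '10.0.0.1',    '10.0.0.1']

rules = %w(deny all
           allow 192.168.1.*)

# With the default order (ACL::DENY_ALLOW), a matching allow rule wins
# over "deny all".
acl = ACL.new(rules)
inside_allowed  = acl.allow_addr?(inside)
outside_allowed = acl.allow_addr?(outside)

# With no rules at all, the order decides the fallback answer.
default_deny_allow = ACL.new([], ACL::DENY_ALLOW).allow_addr?(outside)
default_allow_deny = ACL.new([], ACL::ALLOW_DENY).allow_addr?(outside)

p inside_allowed, outside_allowed, default_deny_allow, default_allow_deny
```

Checking a list this way before installing it can be a cheap guard against a typo that locks out the hosts you meant to admit.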

After having created the ACL, it can be passed into the DRb service via the DRb.install_acl method of the DRb module. Note: you must call this before starting the DRb service for it to function properly. Once the service is running, any unauthorized client will cause the server to raise a RuntimeError containing the message "Forbidden".