beware is a small, fast product that provides global server load balancing functionality. It's in early testing stages right now.

Download the .tar.gz here.

Here is the diagram mentioned below.


# beware v0.1 - polling DNS server. Also makes julienne french fries.
# 
# (c) 2002 by Alaina Hardie

---------------
Building beware
---------------

I've only built this on a Linux system. In fact, I have only built this
on one Linux system. It is a RedHat 7.1 box with current gmake, gcc, 
glibc and kernel 2.4.18. I had a problem building it with an older version
of gcc that wasn't making static ints truly static, so I upgraded to the
latest GCC and everything was hunky dory.

Basically, to build beware, you need to go to the directory where the 
source is located and type:

	make

Really, it's that easy. 

	NOTE NOTE NOTE NOTE

	Please note that beware requires the http-tiny library. I built
	it with version 1.2, which was current as of the date that I 
	originally wrote the program. (Week of March 04, 2002.)

	The http-tiny library is available at:
	http://www.demailly.com/~dl/wwwtools.html
	
	I've included a copy of the Linux-built static library
	with this distribution. It's libhttp.a. Make sure to put it in
	a place where ld can find it -- I have a "-L." flag in the 
	Makefile, so if you put it in the same directory as the Makefile
	you should be fine.


------------------
Configuring beware
------------------

The config file is specified in the CONFIGFILE directive in beware.h. 
By default, this file is /etc/beware.conf. You can change it if you 
like, but why would you do that?

The config file accepts comments. A # in the first column means that the
line is a comment. There isn't a lot of error checking in the config file
format yet, so don't do anything wacky, okay?

Each line in the config file represents a single item. The format is:

testtype,timeout,hostname,ipaddress,testitem,primaryfailover

"testtype" is currently one of two values:
	"http" - The test on this line will do an HTTP HEAD on the 
		URL specified
	"tcp" - The test will do a connect() to the IP:Port specified

"timeout" is the timeout in milliseconds

"hostname" is the FQDN of the host we'll be giving IP addresses for.

"ipaddress" is the IP address we will return if this item is up and running

"testitem" is the field where we specify the item to test. Its format is
dependent upon the value of "testtype":

	testtype:	testitem 	testitem example
	http		URL		http://www.cavesofice.org
	tcp		IP:Port		216.220.40.210:80
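
To make the two test types concrete, here is a rough Python sketch of
what each one might do. It is not the code beware itself uses (beware
is built against http-tiny); the function names tcp_check, http_check
and run_test are made up for this example, and treating any HTTP status
below 400 as "up" is an assumption.

```python
import http.client
import socket
from urllib.parse import urlsplit


def tcp_check(testitem, timeout_ms):
    """Return True if a TCP connect() to "IP:Port" succeeds in time."""
    host, _, port = testitem.rpartition(":")
    try:
        # The config file gives milliseconds; the socket API wants seconds.
        with socket.create_connection((host, int(port)),
                                      timeout=timeout_ms / 1000.0):
            return True
    except (OSError, ValueError, OverflowError):
        return False


def http_check(testitem, timeout_ms):
    """Return True if an HTTP HEAD on the URL gets a status below 400.

    The "below 400 means up" rule is an assumption of this sketch.
    """
    try:
        url = urlsplit(testitem)
        if url.scheme != "http" or not url.hostname:
            return False
        port = url.port or 80
    except ValueError:
        return False
    conn = http.client.HTTPConnection(url.hostname, port,
                                      timeout=timeout_ms / 1000.0)
    try:
        conn.request("HEAD", url.path or "/")
        return conn.getresponse().status < 400
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()


def run_test(testtype, testitem, timeout_ms):
    """Dispatch on the "testtype" field of a config line."""
    checks = {"tcp": tcp_check, "http": http_check}
    return checks[testtype](testitem, timeout_ms)
```

A call such as run_test("tcp", "216.220.40.210:80", 1000) would then
report whether the connect() finished within one second.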

"primaryfailover" is either "primary" or "failover." Basically, this
lets you create an active/passive scenario where a certain group of
servers (the "primary" group) will respond unless all of the servers 
in that group are down, at which point the servers in the "failover"
group take over. So, if something is in the "failover" group, the only
time its address will be returned is if all of the servers in the
"primary" group are unavailable.

See the "Understanding beware" section below for an example of a config
file.

--------------
Running beware
--------------

beware consists of two daemons: 
	bpolld - reads the config file and tests the servers
	bdnsd - responds to DNS queries

They should be started in this order:

	bpolld &
	bdnsd &


--------------------
Understanding beware
--------------------

Take a look at the "bewarediagram.jpg" network diagram included with the
distribution. Please note the cheesy Visio icons and be impressed. Now
let's discuss the make-believe environment.

I have developed a web application (www.somesite.com) that provides a
news source to its users. People go to the web site and read the latest
news, which I slurp from various news agencies like Reuters, CNN, BBC and
The Onion. I store this information in a database. When a user wants to
read an article, they click on a link, the web server reads the article
from the database, and through the magic of technology the article is
displayed in the user's browser. Now she, too, knows that whatzizname
on Ally McBeal is in rehab again.

In this example, I have located my primary web infrastructure at a
data centre in Toronto, ON Canada, which is convenient because it's
where I live and it has great sushi restaurants. The database cluster
that supplies all of the articles is in the same data centre as the
web servers. I want to distribute the load fairly equally, so that
all three production web servers are handling their share of the load.

I have also located two web servers in a cheap data centre in San Diego,
CA, USA. These web servers do nothing but spit back a page that says,
"Our site is down due to a serious disaster. Please check back later."

In all cases, the web servers are considered to be up as long as they
are returning a valid page. My config file might look like this:

# start of config file
#
# the first three servers are production servers in Toronto
#
http,1000,www.somesite.com.,10.1.2.10,http://pw1.somesite.com,primary 
http,1000,www.somesite.com.,10.1.2.31,http://pw2.somesite.com,primary 
http,1000,www.somesite.com.,10.1.2.58,http://pw3.somesite.com,primary 
#
# the next two servers are failover servers with static content,
# located in San Diego
#
http,1000,www.somesite.com.,192.168.1.5,http://fail1.somesite.com,failover 
http,1000,www.somesite.com.,192.168.1.43,http://fail2.somesite.com,failover 
# 
# end of config file
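
For anyone who wants to check a config file before feeding it to
beware, a file like the one above could be parsed along these lines.
This is only a sketch in Python, not beware's own parser: the Item
tuple and parse_config are names invented here, and skipping blank
lines and stripping trailing spaces are assumptions.

```python
from collections import namedtuple

# Field names follow the format line given in "Configuring beware".
Item = namedtuple(
    "Item", "testtype timeout hostname ipaddress testitem primaryfailover"
)


def parse_config(text):
    """Parse beware-style config text into a list of Item tuples."""
    items = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # A '#' in the first column marks a comment. Skipping blank
        # lines as well is an assumption of this sketch.
        if line.startswith("#") or not line.strip():
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 6:
            raise ValueError(
                "line %d: expected 6 fields, got %d" % (lineno, len(fields))
            )
        testtype, timeout, hostname, ipaddress, testitem, group = fields
        if testtype not in ("http", "tcp"):
            raise ValueError("line %d: unknown testtype %r" % (lineno, testtype))
        if group not in ("primary", "failover"):
            raise ValueError("line %d: unknown group %r" % (lineno, group))
        # The timeout is kept in milliseconds, as in the config file.
        items.append(
            Item(testtype, int(timeout), hostname, ipaddress, testitem, group)
        )
    return items
```

Unlike beware, this sketch complains loudly about anything wacky, which
makes it handy as a sanity check.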

Now here is what happens. The user (cleverly shown in a home office on
the left of the diagram) types "www.somesite.com" into her browser. Her
DNS resolver at her ISP hits somesite.com's DNS server, which delegates
responsibility for www.somesite.com to the beware DNS server on the right
of the diagram. The beware server sees the query, looks at its table,
and sees that all three servers at the primary site are responding to
beware's test polls. Therefore, the first DNS query will be answered with
the IP address 10.1.2.10. The second will be answered with 10.1.2.31.
The third will be answered with 10.1.2.58. The fourth will start again
at 10.1.2.10, the fifth will be 10.1.2.31, and so on.

Now let's say that the server called pw2.somesite.com crashes. pw1 and
pw3 are still up. As long as pw2 is not responding to beware's polls,
its IP address will never be returned.
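
The rotation described above, including the part where a dead server
gets skipped, can be sketched in a few lines of Python. This shows the
idea only and is not how bdnsd is actually written; the RoundRobin
class and its method names are hypothetical.

```python
class RoundRobin:
    """Hand out addresses in order, skipping any that are marked down."""

    def __init__(self, addresses):
        self.addresses = list(addresses)
        self.up = set(self.addresses)  # assume everything starts out up
        self.next_index = 0

    def mark(self, address, is_up):
        """Record the result of the latest poll for one address."""
        if is_up:
            self.up.add(address)
        else:
            self.up.discard(address)

    def answer(self):
        """Return the next address that is up, or None if all are down."""
        for _ in range(len(self.addresses)):
            address = self.addresses[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.addresses)
            if address in self.up:
                return address
        return None
```

With the three Toronto addresses loaded, repeated calls to answer()
cycle through 10.1.2.10, 10.1.2.31 and 10.1.2.58. After
mark("10.1.2.31", False), the cycle carries on with only the other two.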

Note that not once have we failed over to the servers in San Diego.

Okay.

So.

Due to a bozo systems admin, who unplugged a Fast Ethernet switch, all
three servers in Toronto are down.  None of the servers in Toronto are
responding to beware's polls, but both of the servers in San Diego are
up. What happens? Well, since all of the servers in the primary group
are down, we go through the failover group. The first DNS query returns
192.168.1.5, which the sharp tacks in the box will notice is the IP
address of fail1.somesite.com.  The second query returns 192.168.1.43,
the third 192.168.1.5, and so on. We will never leave this failover group
until the bozo systems admin in question plugs his Ethernet switch back
in. Once one server in Toronto comes up, the servers in San Diego will
never be offered to clients again... until Bozo strikes back!
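
The primary/failover rule boils down to one decision: use the primary
servers that are up, and only when there are none, fall back to the
failover servers that are up. Here is a minimal Python sketch of that
decision. The function name candidates and the shape of its arguments
are assumptions made for this example, not beware's internals.

```python
def candidates(items, status):
    """Pick the addresses that may be returned right now.

    items  -- list of (ipaddress, group) pairs, where group is
              "primary" or "failover"
    status -- dict mapping ipaddress to True (up) or False (down);
              an address missing from the dict counts as down
    """
    primary_up = [ip for ip, group in items
                  if group == "primary" and status.get(ip)]
    if primary_up:
        # At least one primary server answers, so failover stays unused.
        return primary_up
    return [ip for ip, group in items
            if group == "failover" and status.get(ip)]
```

With Bozo's switch unplugged, every Toronto address is down and the
function returns the two San Diego addresses. As soon as a single
Toronto address is up again, it returns only that one.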

Clear as mud?

Share and Enjoy!

Alaina Hardie - alaina at cavesofice dot org
Toronto, ON
08 Mar 2002

for more information, email beware at cavesofice dot org.