FAQ 5c
Updated: 1/10/00
>>>>>> How do I use a CGI?
==========================
CGI stands for Common Gateway Interface. It's a standard for communications
between information servers and external applications. This means your browser
or web-server can talk to an application that uses CGI. We have grown a
little sloppy in our linguistics, so now any application that uses CGI is
called a CGI. C'est la vie.
Here is a very good web-site that describes the Common Gateway Interface:
http://hoohoo.ncsa.uiuc.edu/cgi/
CGI's are closely associated with Web-Servers. If you are not running a
web-server on your machine - THEN GO GET ONE! If your OS doesn't have one,
then try the latest from Apache.
CGI's are placed into a folder with execute permission. Most web-servers will
automatically create a folder for you called '\cgi-bin'. This is where you
dump your CGI applications.
CGI's are then accessed in two ways: via HTTP requests, and via Server Side Includes.
We are more interested in the HTTP requests, BUT it's much easier to get started
with the Server Side Includes.
I recommend that you use C/C++ for writing your CGI's. Even Perl is acceptable.
Since Java compiles to bytecode that runs on a virtual machine, it doesn't produce neat
little executables. If you want to use Java for your CGI's, then you have to find a
Servlet Server to run with your web-server. Apache has a Servlet Server module.
WARNING! I have yet to play with Servlets! I cannot help you until I do.
All my examples below are done with C/C++.
:::::: CGI's and Server Side Includes
=====================================
Hunt around in your web-server documentation for the phrase: 'Server Side Include'
or the key letters 'SSI'. When you get SSI's activated for your site, you will
be able to place CGI requests into your HTML code (web pages).
A good CGI to start with is the web-counter. Let's pretend we have one written
already. In your '\cgi-bin' there is an application called 'counter.exe'.
On your web-page we have a Server Side Include in the HTML. You can find it
by looking for the keyword '#exec':
<html><body>
My Web Page!<br>
<!--#exec cmd="c:\cgi-bin\counter.exe"-->
Hits so far!<br>
</body></html>
When your web-server prepares this page to show to a visitor, it will replace the
server side include (SSI) line with the output from 'counter.exe'. Notice that
the SSI starts with <!-- and ends with -->. These symbols mark it as a comment,
so it would not be visible even if the line wasn't replaced.
What does the counter do? Every time you call it, the application opens a file,
reads a number, increments it, saves the file, then prints the number. So the
tenth time the CGI is called, it prints the number '10'. This number replaces the
SSI on the page, so the visitor gets to see:
My Web Page!
10 Hits so far!
You can test your counter program by typing it on a command line.
From my command line I type:
c:\cgi-bin\counter.exe
The result?
11
Perfect.
You can even test your CGI through the browser:
http://www.mySite.com/cgi-bin/counter.exe
The Result?
CGI Error
The specified CGI application misbehaved by not returning a complete set of HTTP headers. The headers it did return are:
12
Whoops! What went wrong?
:::::: CGI's and HTTP requests, Fully Formed Pages
==================================================
Here is an example in C/C++ of how 'counter.exe' prints the total hits:
printf("%d",totalHits);
In order to display a fully formed web-page, your CGI needs to place some header
information into its response. Here is our modified C++ code ('\n' is a newline):
char msg[256];
sprintf(msg,"%d",totalHits);
printf("content-type: text/plain\n");
printf("content-length: %d\n",(int)strlen(msg)); // count the characters in msg
printf("\n");
printf("%s",msg);
Now when you try looking at it with your browser, you get a correct result.
HTTP communications need header information. You HAVE TO HAVE a content-type.
The content-length is optional, but highly recommended.
'content-type' defines the contents of the following message. The most common
types are 'text/plain' and 'text/html'. Other resources are transmitted as
'image/gif', 'image/jpeg', 'audio/midi' etcetera.
'content-length' is the number of bytes in the following message.
There is one problem with calculating content length. When a C/C++ program on Windows
prints the newline symbol '\n' in text mode, the runtime automatically writes a carriage
return in front of it. This means that your calculated length will be off by 1 for each
newline in the content. You need to tell the standard output to stop doing the carriage
return thingy. You fix that by switching to binary mode. That way, there is no
interpretation of the output:
setmode(_fileno(stdout),O_BINARY); // see <io.h> and <fcntl.h>; newer Microsoft compilers spell these _setmode and _O_BINARY
You will see me do something similar with Java when calling CGIs (see below).
:::::: CGI's and HTTP requests, Testing with Forms
==================================================
Information is handed to a CGI in two ways, the command line parameters, and the
standard input. Here is an example of an HTML form that calls a CGI:
<form action="http://www.mySite.com/cgi-bin/board.exe?commands" method=post>
<input type=hidden name=Action value=Record>
<input type=text size=30 name=Name>
<input type=submit value="Send">
<textarea name=Message cols=60 rows=20></textarea>
</form>
This will create a text field about 30 characters wide (size sets the width, not a limit), a
button labeled 'Send', and a large text area. When the button is pressed, the action
listed in the form line will be performed. 'board.exe' from the cgi-bin will be
activated and handed a command line and some standard input. Let's assume that the
name 'Fred' was typed into the text field, and 'Hello World' was typed into the text area.
Here is what the command line looks like: commands
Here is what the standard input looks like: Action=Record&Name=Fred&Message=Hello+World
Your CGI will need to examine both the command line and the standard input when
deciding what to do. The web-server puts the command line (everything after the '?')
into the QUERY_STRING environment variable. Here is how you find it:
char *query = getenv("QUERY_STRING");
The standard input is read just like any other type of input. The only trick here is
that you have to retrieve the content-length of the message before reading it:
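char *num = getenv("CONTENT_LENGTH");
int size = 0;
if (num) sscanf(num,"%d",&size); // convert string to number (stays 0 if missing)
msg = (char*)malloc(size+1); // allocate memory, plus room for a terminator
for (ix=0;ix<size;ix++) msg[ix] = getc(stdin);
msg[size] = '\0'; // terminate so msg can be used as a C string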
Now that you have the standard input, you need to interpret it. First, break it into
chunks based on the '&' symbol. This gives you a series of name=value pairs. After
you pick the name you want, you end up with a string value. Let's pick 'Hello+World';
it looks pretty funky.
In order to properly format the standard input, CGI (Common Gateway Interface) had
to sacrifice some characters. The symbols ' ', '+', '&', '=', '?' and '%' are all reserved.
When you see them in the standard input (or even the command line) they have new meanings.
Here is what you do: every time you see '%', it will be followed by two hexadecimal
digits (0123456789abcdef, in either case). These two digits specify one of the 256 possible
byte values. Replace the three symbols '%xx' with that one character; for example, '%21'
becomes '!'. When you see '+' replace it with a blank character.
Now you can read the finished line 'Hello World'.
:::::: How Applets Call CGI's
=============================
I've placed an example of how to access CGI's from an Applet. The code 'Loader.java'
is a stripped down version of the Dragon Court CGI loader: Loader.java
Look at the function Operate(). You need to pass it a URL. This is the web-address of the CGI,
with the command line arguments appended. The Dragon Court URL looks like this:
http://www.fiends.com/cgi-bin/dcdbm.exe?cfg=dcourt&act=myAction
The stuff after the '?' are command line arguments. You may notice that it has the
same format as the standard input I mentioned above. That was a choice on my part,
it's not essential.
The function operate() does 3 things. First, it opens and configures a connection.
Second it transmits the body of the command. Third it reads any response the CGI sends back.
Let's examine the configuration part first.
URLConnection con = path.openConnection();
con.setDoOutput(true);
con.setDoInput(true);
con.setUseCaches(false);
con.setRequestProperty("content-type","text/plain");
con.setRequestProperty("content-length",""+size);
setDoOutput() and setDoInput() mean I'll be both talking and listening. setUseCaches(false)
means don't use any old responses for this request. Content-type and Content-length mean
exactly the same thing here as they did for the CGI.
You will notice in the transmission section (very short) all it sends is the body of the
message. The command line arguments were incorporated into the URL.
Finally the response section. Notice that we don't even try retrieving the message if
the content-length isn't sent. I had WAY TOO MUCH TROUBLE with Dragon Court, because the
only way to ensure that messages were complete was to fill in the content-length field.
I've learned my lesson.
Wow, never thought I'd finish that explanation. ;)