HTML Form to Dynamic PDF Demo - Form 1040 Tax Return

This demo generates a PDF file on the fly from the information entered in an HTML form and from server-side calculations.

Server-Side Processing or Client-Side Processing?

Acrobat Form and FDF are not the only way to do forms -- Explore alternatives.


So, you have to do forms using PDF, and somebody told you to read up on Acrobat(R) Forms and FDF.  That's certainly one way, but backtrack a little and think again.  Do you really have to use Acrobat(R) Forms?  Is it really a requirement that users fill out on-line forms that look exactly like printed forms?  Or do you just need a PDF file in the end containing the exact form filled out with the appropriate content, while collecting the necessary data through a friendlier interface than the official, mean-looking form?  If your case is the latter, please read on.

The standard method of using forms with PDF is via PDF forms and FDF. If PDF-based forms fit your needs, great! However, they may not be appropriate in other situations. For example, how does one deal with form fields that are derived from data in other fields by a complex rule, or from a database on the server? With Acrobat(R) forms, it is possible to perform simple calculations across form fields using JavaScript built into the forms plug-in.  This scheme is based on client-side processing. However, are you really comfortable with the idea of each client machine performing form calculations based on JavaScript code embedded in Acrobat(R) forms?  Current Acrobat(R) Forms JavaScript is limited to simple arithmetic calculations (add, subtract, multiply, and divide).  Hundreds of Acrobat(R) form programmers will be reinventing square root and power functions for loan calculations.  Imagine thousands of copies of forms with bugs in field calculations, spreading via the Internet. Simply placing a new form on your web site will not be enough. Downloaded forms may be passed around via e-mail or archived at a customer site. Recalling a form or telling customers to get a new one would be a PR nightmare. This may be a case of forms having just enough intelligence to get into trouble.  Of course, Forms JavaScript may become more capable with time, but this is not really a question of how sophisticated JavaScript is. The real question is whether form processing should be client-based or server-based.

Potential problems with client-based processing are not a hypothetical concern, as demonstrated by a recent case involving Acrobat(R) Forms JavaScript. In this case, though, the fault appears to lie with the Adobe(R) Forms JavaScript plug-in, not with user-level JavaScript code embedded in the form. In many ways, however, this is actually much worse. All end-user Acrobat(R) installations will probably have to be updated for Acrobat(R) forms to work correctly. This is hard enough for a well-controlled corporate intranet, and it is nearly impossible if it involves random clients across the Internet. The update will require years to be 99% complete. This incident provides a strong argument for a single server-side program that performs form field calculations. Yes, there can be bugs in server-side programs, but once they are discovered and fixed, everything will start working correctly immediately for all users.  No exceptions.

There are other advantages to server-side processing. Placing dynamic images into PDF forms using FDF is a very complex process. With ClibPDF, as shown in the example below, placing images dynamically is nearly trivial.

Bar Code and Non-Base-14 Fonts in Forms: Standard Acrobat(R) form fields cannot use any fonts other than the Acrobat(R) Base 14 fonts (the Courier, Times-Roman, and Helvetica families, plus Symbol and ZapfDingbats). This prevents us from using bar code fonts (except by way of a "hack", i.e., embedding the bar code as an "icon" for a button). With ClibPDF's Type-1 font embedding, any font may be used (to the extent font licenses allow), because all text, including form fields, is generated dynamically as standard text on the server side.  See our demo below, in which the social security number of the taxpayer is encoded as a bar code in the upper-right corner.  Labelling all forms with bar code serial numbers just makes good sense, even if you are not yet set up with bar code readers.  Using bar codes when going back from a printed form to a database will reduce critical errors and improve productivity.  Forms may be keyed on social security number, medical record number, insurance policy number, or invoice and PO numbers.  You can do that now.
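The text does not say which bar code symbology the demo uses. As an illustration only, the sketch below assumes a Code 39 font, where the string to be set in the bar code font must be wrapped in '*' start/stop characters and may contain only digits, uppercase letters, space, and the symbols - . $ / + %. The function names code39_valid and code39_wrap are hypothetical and are not part of ClibPDF.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if c belongs to the Code 39 data character set.
 * '*' is excluded because it is reserved as the start/stop character. */
static int code39_valid(char c)
{
    return c != '\0' &&
           strchr("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%", c) != NULL;
}

/* Builds the string to draw in a Code 39 font: '*' + data + '*'.
 * Returns 0 on success, -1 if data is empty, contains a character
 * outside the Code 39 set, or does not fit in out. */
int code39_wrap(const char *data, char *out, size_t out_size)
{
    assert(data != NULL && out != NULL);

    size_t n = strlen(data);
    if (n == 0 || out_size < n + 3)     /* two '*' plus terminating NUL */
        return -1;

    for (size_t i = 0; i < n; i++)
        if (!code39_valid(data[i]))
            return -1;

    out[0] = '*';
    memcpy(out + 1, data, n);
    out[n + 1] = '*';
    out[n + 2] = '\0';
    return 0;
}
```

Given the input "123-45-6789", code39_wrap produces "*123-45-6789*", which the server would then draw as ordinary text in the embedded bar code font.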

Printed forms and the PDF versions of these forms may have an inappropriate design and layout (fine print, redundant and derivative entries), which may make them especially unsuitable for interactive use on screen. But usually, changing the form design is not an option.  This doesn't mean, however, that everyone has to be subjected to badly designed forms.  You can make it easy for users to fill in well-designed forms, and still produce properly filled-out forms as a PDF file.  And with all the recent web design tools, it is much easier to create a helpful user input page in HTML than is possible with Acrobat(R) forms. [Fine print is possible with HTML too, if necessary.]

In many applications, therefore, you may be better off using an HTML form and generating a new PDF file dynamically on the server with all the proven and more capable technologies for performing field calculations, rather than having a user fill in an Acrobat(R) form of limited intelligence. This demo shows an example of such an application based on ClibPDF and some additional custom code. The form should be very familiar to any U.S. taxpayer.
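As a rough sketch of the server-side half of this approach (not the demo's actual code), a CGI program can read the submitted HTML form fields and compute derived fields before any PDF is drawn. The helper names form_number and total_income, and the field names "wages" and "interest", are hypothetical. The sketch assumes purely numeric values, which need no URL decoding.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Looks up "name=value" in a URL-encoded query string and returns the
 * value as a number, or 0.0 if the field is absent. */
double form_number(const char *query, const char *name)
{
    assert(query != NULL && name != NULL);

    size_t len = strlen(name);
    const char *p = query;
    while (*p != '\0') {
        /* A field matches only if the name is followed directly by '='. */
        if (strncmp(p, name, len) == 0 && p[len] == '=')
            return strtod(p + len + 1, NULL);
        p = strchr(p, '&');             /* skip to the next field */
        if (p == NULL)
            break;
        p++;
    }
    return 0.0;
}

/* A derived field computed on the server from two submitted fields. */
double total_income(const char *query)
{
    return form_number(query, "wages") + form_number(query, "interest");
}
```

For a form submitted with the GET method, the query string is available to the CGI program as getenv("QUERY_STRING"). With the input "wages=50000&interest=120.50", total_income returns 50120.5, and that value can then be drawn into the generated PDF.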

Please also note that dynamic PDF applications do NOT have to follow typical Acrobat(R) workflows -- preparing a form template in a drawing application, writing it out to a PDF file, loading it into Acrobat(R), defining form fields, etc. For simple forms such as invoices or order forms, it is not hard at all to draw forms programmatically using ClibPDF.  Even for moderately complex forms, it is quite feasible.  One example of such use is by a government agency in Spain, which generates serial-numbered forms dynamically.  Generating a form this way from scratch every time is generally very fast, probably faster than most approaches that attempt to merge existing PDF templates with new content. Saved templates generally do not buy you any speed, because the process of merging PDFs is highly complex. So take a moment to re-examine whether you really must create forms with drawing applications. Having said that, we must admit that most IRS forms are too complex to duplicate programmatically, and hence this demo.
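To give a sense of what drawing a simple form element programmatically amounts to, here is a minimal sketch that emits raw PDF content-stream operators for a stroked box with a text label inside it. It deliberately does not use the ClibPDF API. The function name draw_labeled_box is hypothetical, and the sketch assumes the page defines a font resource named /F1.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes PDF content-stream operators into out: a stroked rectangle
 * and a 10-point label inset 4 points from its lower-left corner.
 * Coordinates are in points (1/72 inch), origin at the lower left.
 * Returns the number of characters written, or -1 if the label needs
 * PDF string escaping (parentheses, backslash) or out is too small. */
int draw_labeled_box(char *out, size_t out_size,
                     double x, double y, double w, double h,
                     const char *label)
{
    assert(out != NULL && label != NULL);

    if (strpbrk(label, "()\\") != NULL)
        return -1;                      /* escaping is not handled here */

    int n = snprintf(out, out_size,
        "%.2f %.2f %.2f %.2f re S\n"                /* rectangle, stroke */
        "BT /F1 10 Tf %.2f %.2f Td (%s) Tj ET\n",   /* text object */
        x, y, w, h, x + 4.0, y + 4.0, label);

    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

Calling draw_labeled_box(buf, sizeof buf, 72, 700, 200, 20, "Serial No. 000123") yields two lines of operators for a box one inch from the left edge of the page. A library such as ClibPDF wraps this kind of output, along with the rest of the PDF file structure, behind drawing calls.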


Technical Note:
The form in this application is based on a bitmap image at 300, 200 or 150 DPI produced from the original PDF file on the IRS web site using Ghostscript or an application on a DPS system.  The images are stored in a custom format called PDFIMG that is optimized for fast loading into PDF streams.  On top of this background image, we are able to draw anything and place additional images using ClibPDF. The form is not based on a PDF template used as the background. The bitmap image for a form of this complexity typically has a size of about 56 kbytes (ZIP-compressed) at 200 DPI, which is completely adequate for most applications (please judge for yourself). The image size is only 66 kbytes (CCITT G4) at 300 DPI. Of course, the simpler the form, the smaller its image becomes, because it compresses better. This CGI program consumes only about 50 msec of CPU time. In comparison, a simple e-mail CGI (in Perl) that we use for ClibPDF Mailing List subscription consumes 130 msec. The hardware for the server is a 200 MHz Pentium Pro with 128 MB RAM, running FreeBSD 2.2.7-STABLE. The CGI binary is 180 kbytes with the ClibPDF and CGI libraries statically linked.
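To put the quoted image sizes in perspective, the sketch below computes the uncompressed size of a page bitmap. It rests on two assumptions that the note does not state: that the page is US Letter (8.5 x 11 inches) and that the bitmap is 1 bit per pixel. The function name bilevel_bitmap_bytes is hypothetical.

```c
#include <assert.h>

/* Uncompressed size in bytes of a 1-bit-per-pixel page bitmap.
 * Assumption: each row is padded to a whole number of bytes. */
long bilevel_bitmap_bytes(double width_in, double height_in, int dpi)
{
    assert(width_in > 0.0 && height_in > 0.0 && dpi > 0);

    long width_px  = (long)(width_in  * dpi + 0.5);  /* round to pixels */
    long height_px = (long)(height_in * dpi + 0.5);
    long row_bytes = (width_px + 7) / 8;             /* 8 pixels per byte */

    return row_bytes * height_px;
}
```

Under those assumptions, a Letter page is 468,600 bytes uncompressed at 200 DPI and 1,052,700 bytes at 300 DPI, so the sizes quoted above would correspond to compression ratios of roughly 8:1 and 16:1 respectively.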
ADOBE and ACROBAT are registered trademarks of Adobe Systems Incorporated.
(Copyright 1999-2002, FastIO Systems, All Rights Reserved; Last modified: 2002-04-08)