Online Lab - Introduction to Eclipse, CVS, Tomcat, and ANT for Life Sciences Developers

Creative Commons License
This work is licensed under a Creative Commons License.

Presented 2004 at the Developing the Tools: Canadian Bioinformatics Workshop

Key Concepts

Java developers utilize several different software applications to streamline development and delivery of their applications. Diving into any bioinformatics-based software project requires prerequisite experience with these applications. The goal of this lab is to introduce you to several of the more popular applications: the Eclipse IDE for code development, CVS for code maintenance, Tomcat for deploying web-based applications, and Ant for code delivery. Furthermore, in exploring these applications, we will be deploying a web service; web services provide a platform- and programming language-independent way of delivering data and analysis.

What you will be able to do at the end of this lab

  1. Download an IDE (Eclipse)
  2. Check out a software project using CVS
  3. Start a web server using Tomcat
  4. Compile and build a Java application using Ant
  5. Deploy a web service
  6. Run a Perl or Java-based web services client that does something interesting

Introduction

Programming is about communication: writing code for humans first and machines second. In this lab, we will look at some of the modern, freely available tools that Java developers use to communicate with one another (and themselves). It may be a frustrating experience at first, as learning how to use an IDE or a versioning system like CVS can involve as many intricacies as the act of programming itself. However, experience and patience with these tools will greatly extend your software's useful life. In this introduction, we will walk through the history of some of these applications, the goals they hope to address, alternative approaches, and a bit of relevant programming philosophy.

Programming philosophy

Software which does more will likely be more complicated and therefore more difficult to learn.

New software which is similar to software you already use should be easier to learn.

Integrated Development Environments (IDEs)

An IDE can be thought of as a workspace for code development; it aids software developers in writing and running their applications. IDEs first appeared in 1964 for the BASIC language, largely in response to the shift from keypunch-based software development to development via a computer terminal. This earliest IDE had only four commands: "NEW" (to start a new program), "OLD" (to open a previously-made program), "LIST" (to display the current program), and "RUN" (to execute the program). Program modifications were made by typing in the line number and the corresponding new line of code. While this may seem archaic when compared to modern-day IDEs, the advent of the IDE allowed software developers to write and execute code with relative ease.

The sophistication and diversity of modern-day IDEs have kept most novice programmers from being exposed to them. This is not without good reason, as most instructors would rather see their students learn the basics first and put time into learning other things later. Instead, exposure to software development often occurs in text editors like EMACS or VI. This is, unfortunately, the most inconvenient way to program: it frequently lengthens development time and encourages long, unreadable programs (primarily because, in a plain editor, it is easier to maintain fewer, longer files than tens or hundreds of shorter files, even though the latter may be more intelligible).

The sophistication and diversity of IDEs is primarily due to the existence of several IDEs per language and the large amounts of vendor-specific technology that are at the programmer's disposal. Fortunately, the skills that can be acquired in one IDE are more often than not applicable to other development environments. In this lab, we will focus on the Eclipse IDE. Eclipse is an open-source project for building IDEs. In itself, it is also a powerful Java-based development environment that aids software developers in designing, managing, traversing, debugging, and running their software systems.

Programming philosophy

Instead of that very neat and orderly procession, which doesn't happen even in the real world with buildings, software is much more like gardening: far more organic, far more malleable, and something that you have to be prepared to interact with to improve all the time.

-Andy Hunt and Dave Thomas

Eclipse

Eclipse is self-described as "a kind of universal tool platform - an open extensible IDE for anything and nothing in particular". While we will be using Eclipse in the context of CVS, Java, and Ant, Eclipse's principal role is to provide an architecture for creating and integrating new development tools. To clarify, Eclipse can be thought of as a large plug-in engine that allows a developer to install (plug in) different tools for managing their content (code and resources). An example is the Eclipse plug-in called JDepend (for more info on JDepend, read Managing Your Dependencies with JDepend). JDepend offers a collection of tools for observing various architectural features of your software system. While this may not be everybody's cup of tea, if you are interested in integrating its analysis into your development environment, you can use Eclipse to plug in the tool. This type of flexibility is central to Eclipse's design; Eclipse allows you to use tools in the context of your individual preferences, with a very large limit on the number of tools that you can integrate.

Eclipse History

Eclipse originated from OTI and IBM's VisualAge IDE in 1999. Late in 2001, IBM donated 40 million US dollars to the construction of "Java-based open source software ... [that] will enable developers to use software tools from multiple suppliers together". In effect, by opening up Eclipse, IBM aimed to do for IDEs what Linux did for operating systems and Apache did for web servers. The Eclipse platform was designed to meet the following requirements:

  1. Support the construction of a variety of tools for application development
  2. Support an unrestricted set of tool providers (anyone can build Eclipse plug-ins)
  3. Support tools to manipulate existing and emerging content types (HTML, Java, etc.)
  4. Facilitate seamless integration of tools across tool providers and content types
  5. Support GUI (graphical) and non-GUI (command-line) based IDE development
  6. Run on a wide-range of operating systems
  7. Capitalize on Java for writing tools

Part of the history of Eclipse involves its challenge to Sun's community-based IDE project, NetBeans. (It is believed that the name of Eclipse is a slight against Sun). In the first quarter of 2004, Eclipse announced its independence from IBM in a move that was designed to clear the way for IBM rivals to join the Eclipse board of directors. Furthermore, this move was designed to allow Eclipse to grow into a platform that, to quote David Orme, the leader of the Eclipse project, "was something more than IBM". However, business differences involving Java standards practice and Sun's integration with its NetBeans IDE have remained.

Further information can be found by reading:

Versioning Control Systems (VCS)

Versioning control systems have been designed to allow groups of people to work on a software system simultaneously. They also allow individual developers to recover information from a previous point in the software's lifecycle or simply to back up their projects on another machine. The earliest forms of version control involved a process called "lock-modify-unlock". This involved a developer placing a lock on a software component (code or resource) so that no other developers could modify it while a change was being made. This was a big step in ensuring that code was being developed in an organized fashion. However, problems arose when software developers forgot to release locks before they went on vacation, or when more than one person needed to change a software component in more than one way. Waiting was out of the question.

The rise of open source software further intensified the need for concurrent versioning; all of a sudden, multiple people in different geographic areas could be working on the same software components at the same time. No single developer on a popular project could be responsible for integrating hundreds of bug fixes and code contributions. Further, there was very little point in organizing everyone to work together, as most open source projects have a steady flux of contributors joining and leaving. This created the need for automated systems that manage concurrent modifications and keep everyone up to date with the newest versions of the software components.

Programming philosophy

Version control is the art of managing changes to information. It has long been a critical tool for programmers, who typically spend their time making small changes to software and then undoing those changes the next day. But the usefulness of version control software extends far beyond the bounds of the software development world. Anywhere you can find people using computers to manage information that changes often, there is room for version control.

-Excerpt from "Version Control with Subversion"

Concurrent Versions System (CVS)

CVS emerged in the early 1990s as a network-friendly version control system that allowed developers to make concurrent revisions to software components from any Internet-accessible location in the world. This was coupled with a merging control system that notified developers of conflicts when they committed their code to CVS repositories (code bases). Basically, this informs developers when two or more people have modified the same code at the same time; it is the responsibility of the individual who created the conflict to resolve it. CVS's advantage was that it gave open source projects an easy way to give public access to the source code and to aid in the generation of software patches. In this lab, we will be using a CVS tool that has been integrated into Eclipse to check out Java code from a software repository (if you just went "what?!?!", review the list of important CVS terms below).

Important CVS Terms (from Open Source Development with CVS)

  • Revision - A committed change in the history of a file or set of files. A revision is one "snapshot" in a constantly changing project.
  • Repository - The master copy where CVS stores a project's full revision history. Each project has exactly one repository.
  • Working copy - The copy in which you actually make changes to a project. There can be many working copies of a given project; generally each developer has their own copy.
  • Check out - To request a working copy from the repository. Your working copy reflects the state of the project as of the moment you checked it out; when you and other developers make changes, you must use commit or update to "publish" your changes and view others' changes.
  • Commit - To send changes from your working copy into the central repository. Also known as check-in.
  • Log message - A comment you attach to a revision when you commit it, describing the changes. Others can page through the log messages to get a summary of what's been going on in a project.
  • Update - To bring others' changes from the repository into your working copy and to show if your working copy has any uncommitted changes. Be careful not to confuse this with commit; they are complementary operations. Mnemonic: update brings your working copy up to date with the repository copy.
  • Conflict - The situation when two developers try to commit changes to the same region of the same file. CVS notices and points out conflicts, but the developers must resolve them.
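
The relationship between check out, commit, and update can be illustrated with a toy model. The sketch below is not how CVS is implemented; the class ToyRepository and its methods are hypothetical names invented for this illustration. It models a single file whose commits are rejected when the working copy is based on an outdated revision, which is the situation in which a developer must run update (and resolve any conflicts) before committing.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a single-file repository; NOT the real CVS implementation. */
final class ToyRepository {
    private final List<String> revisions = new ArrayList<String>();

    ToyRepository(String initialContent) {
        revisions.add(initialContent); // this becomes revision 1
    }

    /** Number of the newest revision in the repository. */
    int headRevision() {
        return revisions.size();
    }

    /** Update: returns the content of the newest revision. */
    String update() {
        return revisions.get(revisions.size() - 1);
    }

    /**
     * Commit succeeds only if the working copy was based on the newest
     * revision. Otherwise the developer has to update and merge first.
     */
    boolean commit(int baseRevision, String newContent) {
        if (baseRevision != headRevision()) {
            return false; // working copy is out of date
        }
        revisions.add(newContent);
        return true;
    }

    public static void main(String[] args) {
        ToyRepository repo = new ToyRepository("ACGT");
        int firstBase = repo.headRevision();  // two developers check out revision 1
        int secondBase = repo.headRevision();

        System.out.println(repo.commit(firstBase, "ACGTA"));  // accepted
        System.out.println(repo.commit(secondBase, "ACGTT")); // rejected: update needed
    }
}
```

In this model the second developer would call update(), merge the change by hand, and then commit again using the new head revision as the base.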

The Cederqvist and the History of CVS

The Cederqvist is the online CVS manual (named after Per Cederqvist, the original author). It can be found online at Cederqvist HOME. In addition to providing everything you could ever need to know about CVS, it also provides a brief history of CVS up to the point at which Jim Kingdon made CVS network-friendly:

CVS started out as a bunch of shell scripts written by Dick Grune, posted to the newsgroup comp.sources.unix in the volume 6 release of July, 1986. While no actual code from these shell scripts is present in the current version of CVS, much of the CVS conflict resolution algorithms come from them.

In April, 1989, Brian Berliner designed and coded CVS. Jeff Polk later helped Brian with the design of the CVS module and vendor branch support.

Subversion as an Alternative to CVS

CVS is widely prevalent and has extensive third-party software support. However, new open source initiatives have aimed to supplant CVS as the de facto versioning control system of the free software world. The most notable is Subversion, the initial product of Karl Fogel (author of the popular "Open Source Development with CVS" book), Jim Blandy, and Ben Collins-Sussman. The improvements that Subversion is designed to offer over CVS are:

  1. Directory versioning: CVS tracks only the history of files, Subversion tracks changes to whole directory trees over time.
  2. True version history: Operations such as copies and renames are not supported in CVS. Furthermore, you cannot replace an old file with a new file of the same name without it inheriting the history of the old file. Subversion corrects these problems.
  3. Atomic commits: (or Transactional commits) A collection of modifications goes into the repository as a whole or not at all.
  4. Versioned metadata: Files and directories can have associated key/value information that is versioned over time.
  5. Choice of network layers: Subversion makes it easy to implement new network access mechanisms.
  6. Consistent data handling: File differences are expressed using a binary differencing algorithm which works on both text and binary files. This fixes inefficiencies that CVS has with committing binary data.
  7. Efficient branching and tagging: Faster branching and tagging mechanisms. CVS cost is proportional to project size.
  8. Hackability: Well-defined APIs. No historical baggage.

You might be wondering why we are using CVS. Despite the existence of advanced VCSs, it remains the most widely recognized and adopted VCS in the world. It would be safe to assume that, for the time being, 99% of all projects that you will interact with in the life sciences field will be using a CVS repository to manage their code development. However, as 3rd party tools like IDE plug-ins and standalone clients begin to emerge for Subversion, it is likely to supplant CVS as the VCS of choice.

Web servers

Web servers deliver information to Internet users in various data formats or through programming services. The basic interaction that most people are familiar with is typing in the location of a web page (e.g. http://www.bioinformatics.ca/tools.php) and receiving the contents of the page at that address. Web servers are responsible for the delivery of this data through a protocol called "HTTP" (Hypertext Transfer Protocol). When resolving this location or URL (Uniform Resource Locator) from your web browser (e.g. MS Internet Explorer or Mozilla), there are 5 basic steps:

  1. The browser breaks the URL down into the protocol (HTTP), the server location (www.bioinformatics.ca), and the file "tools.php".
  2. The browser connects to a DNS server (Domain Name System) and determines the IP address from the server location (www.bioinformatics.ca has the IP address 137.82.44.22).
  3. The browser connects to the web server at the server location (www.bioinformatics.ca) on port 80, the default web server port.
  4. The browser uses the HTTP protocol to send a GET request to the web server to obtain the file "tools.php".
  5. The browser reads the returned HTML tags and displays the page on your screen in your Internet browser.
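
Steps 1, 3, and 4 above can be sketched in a few lines of Java. This is only an illustration of the idea, not how any particular browser works; the class name HttpRequestSketch and its helper methods are made up for this example. The code splits the URL into its parts, picks the default port when none is given, and builds the text of a minimal GET request without actually opening a network connection.

```java
import java.net.URI;

/** Sketch of steps 1, 3 and 4: split a URL and build the GET request text. */
final class HttpRequestSketch {

    /** Port to connect to: the explicit one, or 80 (the HTTP default). */
    static int portOf(URI uri) {
        return uri.getPort() == -1 ? 80 : uri.getPort();
    }

    /** Minimal HTTP/1.1 GET request for the file named by the URL. */
    static String buildGetRequest(URI uri) {
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // no file given: ask for the server's root document
        }
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + uri.getHost() + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    public static void main(String[] args) {
        URI uri = URI.create("http://www.bioinformatics.ca/tools.php");
        System.out.println("protocol: " + uri.getScheme());
        System.out.println("server:   " + uri.getHost());
        System.out.println("port:     " + portOf(uri));
        System.out.print(buildGetRequest(uri));
    }
}
```

Step 2 (the DNS lookup) is left out so that the sketch runs without network access; in Java it would typically be done with java.net.InetAddress.getByName.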

For more information on how web servers operate, read:

Web Server History

In 1991, the first web server was developed by CERN (Conseil Européen pour la Recherche Nucléaire) and introduced at info.cern.ch. By the end of 1992, there were 50 web servers located at CERN. Furthermore, at this server location, between 1991 and 1994, server load increased by a factor of 10 each year. In 1995, the most popular web server on the Internet was a public domain HTTP daemon developed by Rob McCool at the NCSA (National Center for Supercomputing Applications). After Rob left the NCSA in mid-1994, development of the daemon was carried on by many different developers worldwide with no clear lines of organization. A small group of these developers gathered their web server modifications together via private e-mail for the purpose of coordinating their changes. This became the original Apache group. The 1.0 version of the Apache web server was released at the end of December 1995. Less than a year later, Apache supplanted the NCSA server as the most popular web server on the Internet (where it remains today). By the middle of June 1999, there were ¾ million web servers worldwide. In April 2001, there were over 24 million web servers worldwide.

Tomcat

Tomcat is the servlet container that is used in the official reference implementation for the Java Servlet and JavaServer Pages technologies. In a deployed web server environment, it is added to an Apache server to provide access to JSP technology.

A Brief Introduction to JSP

This topic will be covered in depth later in this workshop (day 5). However, let's quickly review the fundamentals of this technology in relation to Tomcat. JSP pages are similar to Active Server Pages (ASP) and PHP: Hypertext Preprocessor (PHP) pages in that special tags are used to mark out parts of otherwise normal HTML pages that are to be handled separately. The difference is that, whereas ASP and PHP are typically processed by a module inside the web server, JSP is handled by a separate server process. For JSP, this process is Tomcat, which takes the JSP page and converts it into a Java program. From a technology perspective, it allows developers to write server-side scripts that utilize the power of the Java programming language.

For more information on JSP and JavaServlets, read:

In this lab, we will be using Tomcat to deploy a web service. A web service is deployable using the Apache Axis plug-in for Tomcat. With Axis, developers can access Java services worldwide, independent of programming language. (This is accomplished by using SOAP, the Simple Object Access Protocol, which transports the data over HTTP in an XML format.)
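
To make the idea of "data over HTTP in an XML format" more concrete, here is a minimal sketch of what a SOAP 1.1 request message looks like, built as plain text in Java. The envelope namespace is the standard SOAP 1.1 one; everything else (the class name SoapEnvelopeSketch, the service namespace urn:example-fasta-service, and the operation name listFiles) is a hypothetical placeholder and is not taken from Axis or from the service used in this lab. In practice a toolkit such as Axis generates these messages for you.

```java
/** Builds a SOAP 1.1 request envelope as text; service and operation names are hypothetical. */
final class SoapEnvelopeSketch {

    /** Standard namespace of the SOAP 1.1 envelope. */
    static final String SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    /** Wraps a parameterless call to the given operation in a SOAP envelope. */
    static String envelope(String serviceNamespace, String operation) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_ENV_NS + "\">\n"
             + "  <soapenv:Body>\n"
             + "    <ns:" + operation + " xmlns:ns=\"" + serviceNamespace + "\"/>\n"
             + "  </soapenv:Body>\n"
             + "</soapenv:Envelope>\n";
    }

    public static void main(String[] args) {
        // Placeholder namespace and operation, chosen only for illustration.
        System.out.print(envelope("urn:example-fasta-service", "listFiles"));
    }
}
```

Because the message is just XML sent in the body of an HTTP request, a client written in Perl, Java, or any other language can produce it, which is what makes the service language-independent.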

It is recommended to read the following on web services and Axis:

Ant

Ant is an operating-system-independent, Java-based make utility. Most developers have had the experience of trying to decipher the compile and build commands of complicated projects, as sometimes the native Makefiles just don't work for very annoying reasons (like extra whitespace or invisible characters). The goal of Ant is to resolve these issues by using XML to describe project commands as Tasks. Very simple chained commands can be created using Ant that take a user from setting up their environment, through compiling and building a project, to deploying a web service or running an application. Ant is also configurable through extensions to the language. These extensions are implemented using Java classes. Instead of writing shell commands as in a make utility, the configuration files are XML-based, calling out a target tree where various tasks get executed. Each task is run by an object that implements a particular Task interface. In this lab, we will use Ant to compile, build, and deploy a web service.
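
The notion of chained targets can be illustrated with a small toy model. The sketch below is not Ant's implementation and does not use the Ant API; the class TargetChainSketch and the target names (init, compile, build, deploy) are assumptions made up for this example. It simply works out the order in which targets would have to run when each target names the targets it depends on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of dependent build targets; not Ant's actual implementation. */
final class TargetChainSketch {

    /** Returns the order in which targets run so that the goal can be reached. */
    static List<String> executionOrder(Map<String, List<String>> dependsOn, String goal) {
        List<String> order = new ArrayList<String>();
        visit(dependsOn, goal, order, new ArrayList<String>());
        return order;
    }

    private static void visit(Map<String, List<String>> dependsOn, String target,
                              List<String> order, List<String> inProgress) {
        if (order.contains(target)) {
            return; // each target runs at most once
        }
        if (inProgress.contains(target)) {
            throw new IllegalStateException("Circular dependency at " + target);
        }
        if (!dependsOn.containsKey(target)) {
            throw new IllegalArgumentException("Unknown target: " + target);
        }
        inProgress.add(target);
        for (String dependency : dependsOn.get(target)) {
            visit(dependsOn, dependency, order, inProgress); // dependencies run first
        }
        inProgress.remove(target);
        order.add(target);
    }

    public static void main(String[] args) {
        // Hypothetical targets: each one lists the targets it depends on.
        Map<String, List<String>> targets = new LinkedHashMap<String, List<String>>();
        targets.put("init", Collections.<String>emptyList());
        targets.put("compile", Arrays.asList("init"));
        targets.put("build", Arrays.asList("compile"));
        targets.put("deploy", Arrays.asList("build"));

        System.out.println(executionOrder(targets, "deploy"));
    }
}
```

Asking for the last target in the chain is enough to trigger everything before it, which is why a single Ant command can take a project from a clean checkout to a deployed service.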

For more information on Ant, read:

Online Laboratory - Worked Example

Welcome to the lab. The introduction provided a lot of background information on the four key technologies that we will be working with over the next 1.5 hrs. We will not review much of the introduction during the lab because we have limited time to actually try out these technologies. Please read the introduction before the lab.

Lab outline

This lab is broken into the following 10 steps:

  1. Start Eclipse
  2. Exploring Eclipse
  3. Checkout Java code using CVS inside of Eclipse
  4. Explore the fasta_service project
  5. Explore the Ant script
  6. Visit the Tomcat directory
  7. Build and deploy the web service using Ant
  8. Run the Java and then the Perl client
  9. (Optional) Development exercise
  10. Dinner

Lab Notes

  • Elements of this lab which have been added or modified for the online version are denoted as

    Online lab note

  • Code appears as so
  • Code comments are in bold
  • Empty boxes are for notes in the printed version of this lab
  • File contents are outlined with a box

Let's get started on this lab.

Step 1: Start Eclipse

Follow along as we start-up Eclipse. Write down descriptive instructions for starting Eclipse (instructions will be on the board at the front). You will need this information over the duration of the course.

Location of Eclipse




Online lab note

Please download Eclipse for your operating system from Eclipse HOME. Follow the installation instructions and try starting Eclipse.

Once started, you should see the User Interface below:

Eclipse

Figure 1: Eclipse IDE. Eclipse uses perspectives to display various windowing arrangements. Perspectives are selected from either the leftmost vertical toolbar adjacent to the Navigator window or from WINDOW > OPEN PERSPECTIVE in the menu bar. Depending on which operating system you use, Eclipse can look different. This is because Eclipse uses a Java API developed by IBM for its graphical user interface; this API, called SWT, binds directly to the native windowing system. (Do not worry if you don't see "fasta_service" in your Navigator window; this is the project that we will be working on later in this lab.)

Troubleshooting

If you see the message "A Java Runtime Environment (JRE) or Java Development Kit (JDK) must be available in order to run Eclipse. No Java virtual machine was found after searching the following locations: some directory", Eclipse cannot find a JRE or JDK to use. You will need to install one for your operating system from http://java.sun.com

Step 2: Exploring Eclipse

Let's explore the Eclipse user interface a bit by creating a new project. As mentioned in the introduction Eclipse is a tool for building IDEs; you can create and manage any type of project you want, including those in Java. Let's start by building an empty project. Click FILE > NEW > PROJECT and select SIMPLE and then NEXT.

New Project in Eclipse

Figure 2: Creating a new project. After selecting FILE > NEW > PROJECT, you will be able to see this pop-up window. Here you can select the types of projects you would like to create. The default choices are Java, Plug-in Development, and Simple projects. Choose SIMPLE project and click NEXT.

Give your new project a name and then click FINISH. Notice that you have created a project and it contains one file called .project. This file just manages project information (at this point you can simply leave it as is).

Now create a new folder by clicking on FILE > NEW > FOLDER. You can give this folder any name and then click FINISH. Notice a folder has appeared in your project.

The final exercise in this step is to import a file from your local file system into the folder you just created. Right-click on your folder and select IMPORT from the pop-up menu. You should see the pop-up window below.

Import Dialogue in Eclipse

Figure 3: Importing files into Eclipse. From this pop-up window you can select various projects, files, and features that you may want to import into your project. It is recommended to try opening up the wizards for each of these so that you understand what they do. We are going to be importing a file from the file system so click on FILE SYSTEM then NEXT.

It is important to observe that opening an IDE and then creating, editing, and importing files is relatively straightforward, but what Eclipse does not provide by default are the tools necessary to help you organize and incorporate your ideas into your project. It is usually dangerous to open an IDE without a well-laid plan of what you want to accomplish. Routine strategies involve getting some paper, a pen, and a nice cup of coffee, and then starting to brainstorm.

Import File Dialogue in Eclipse

Figure 4: Importing a file into Eclipse. Once you have selected your home directory in the FROM DIRECTORY textbox, you will be able to choose files from that directory to import into your Eclipse project. Here, I have selected a file named think.jpg.

It is possible to manage various collections of files using Eclipse. In the next step, we will be using CVS to checkout and create a project from a repository; it is also possible to put many different types of files in version control under CVS.

Step 3: Checkout Java code using CVS inside of Eclipse

In this lab, we will be checking out a project called fasta_service from a local CVS repository.

Description of Project: fasta_service

The project that we will be working on in this lab is named fasta_service. It is a Java web service; it allows for remote calls over SOAP (Simple Object Access Protocol). SOAP is an XML-based messaging protocol carried over HTTP that allows for language-neutral method invocation (basically, it allows you to run commands from one machine on another, regardless of the programming language that the client wants to use). The fasta_service project will allow you to remotely publish the contents of a directory that contains fasta files over the Internet. We will go through more of what it means to be a web service, and more detail on how fasta_service works, as we progress through this lab.
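
As a rough idea of what "publishing the contents of a directory that contains fasta files" might involve on the server side, consider the sketch below. It is not the fasta_service code; the class name FastaDirectorySketch, the choice of the .fasta and .fa extensions, and the default directory name are all assumptions made for this illustration. The method only selects the file names that look like fasta files; a real service would then return such a list to the remote caller.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: pick out the fasta files among a directory's entries. */
final class FastaDirectorySketch {

    /** Keeps only names that look like fasta files (the extensions are an assumption). */
    static List<String> fastaNames(String[] names) {
        List<String> result = new ArrayList<String>();
        if (names == null) {
            return result; // File.list() returns null when the path is not a directory
        }
        for (String name : names) {
            String lower = name.toLowerCase();
            if (lower.endsWith(".fasta") || lower.endsWith(".fa")) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Directory to scan: first argument, or "data" as an illustrative default.
        String directory = args.length > 0 ? args[0] : "data";
        for (String name : fastaNames(new File(directory).list())) {
            System.out.println(name);
        }
    }
}
```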

Open the CVS Repository Exploring Perspective from the Perspective toolbar. By default, you should see the perspective below.

CVS Repository Exploring in Eclipse

Figure 5: CVS Repository Exploring Perspective. Once the location of a CVS repository is specified, this perspective acts as a repository browser. From it you are able to view the projects in your repository and check them out into your local workspace.

Right-click inside the CVS Repository window (on the left) and from the pop-up menu that appears select NEW > REPOSITORY LOCATION. The following window should appear.

Add CVS Repository in Eclipse

Figure 6: Add CVS Repository. Specify CVS connection properties to connect to a remote repository. For demonstration purposes, the picture above shows the connections that I use to connect to a local CVS server.

In the table below, write in the parameters you will require to connect to the CVS server inside the classroom (these will be posted on the board at the front).

Location of CVS Repository
Host  
Repository Path  
User  
Password  
Connection Type  
Port  

At this point you should be able to view the contents of the remote repository. Select the folder called fasta_service. Right-click on this folder and select CHECK OUT AS PROJECT. This will create the project in your local workspace.

Online lab note

There is no external CVS server currently set up for online lab users. Download the code (available at the bottom of this document) and import it into Eclipse.

CVS Perspective in Eclipse

Figure 7: CVS Perspective. When the repository is specified, you will be able to browse its contents. Here the fasta_service folder has been selected and right-clicked to display the pop-up menu. From this menu, select CHECK OUT AS PROJECT to copy this project to your local workspace. Notice that you can also browse source code and history information for individual files.

Step 4: Explore the fasta_service project

Open the Java perspective (from the Perspective toolbar on the left) and expand the fasta_service folder. You should see the user interface below.

Fasta Service Java Perspective in Eclipse

Figure 8: Exploring the fasta_service project using the Java perspective.

There are six directories to note in this project; the table below explains the contents of each.

src/ This is where the .java files exist. Expanding this directory, you will see the package structure and Java files that comprise this Java application. The standard naming convention is to base package names on your organization's domain name in reverse order. Here the root package is ca.bioinformatics since our URL is www.bioinformatics.ca. Later in this step we look at each of the packages and files to understand how fasta_service works.
resources/ This is where the non-Java files that comprise this project exist. Inside this directory there are 4 files. Two are WSDD files, which are used for deploying our application to the Internet (we will look at these a bit later). The other two are a log4j configuration file and a properties configuration file. The former is used to configure the log4j properties of this application (log4j is an industry-standard logging package). The properties configuration file contains information for customizing the fasta_service.
data/ The data folder contains sample fasta files that we will use to test our service.
filesToDeploy/ This directory is a dynamic directory that is created in your project. It is a user-defined directory that has been specified in Ant to allow us to package this application for distribution.
lib/ The lib/ folder contains all the extra application dependencies (those above and beyond the JDK). The contents of the lib/ folder are copied to the CLASSPATH when the application is executed.
perl/ Contains fasta_service client code written in Perl.

A directory that does not appear in the above table is the bin/ directory. This is the location of the compiled Java classes (class files). (NOTE: There are two types of files in any Java application: the java files, which are text files containing Java code, and the class files, which are the compiled java files that are executed by the Java Virtual Machine (JVM).) If you browse the workspace directory in your terminal, you will see that the bin/ directory does exist. Its location was specified when the project was initially created; you can either keep the class files alongside the java files or keep them separate. To view how this was set, open PROJECT > PROPERTIES from the menu bar; you should see the window below.

Fasta Service Project Properties in Eclipse

Figure 9: Project properties. Open by selecting PROJECT > PROPERTIES from the menu bar. Notice that the source file directory has been specified as fasta_service/src and the output folder where the .class files are built has been set as fasta_service/bin.

When adding new dependencies to your project, you can also add them in the Properties window. The fasta_service project has several jar dependencies and a dependency on the resources/ folder (for more information about how Java deals with third-party dependencies via the CLASSPATH, read http://java.sun.com/j2se/1.3/docs/tooldocs/win32/classpath.html). To look at these dependencies, open Properties (by selecting PROJECT > PROPERTIES from the menu bar). Select JAVA BUILD PATH and then click on the LIBRARIES tab. The following window should be visible.

Fasta Service Project Properties in Eclipse (Library)

Figure 10: Library properties. Each of the library dependencies has been added to the project. The jars were added by selecting ADD EXTERNAL JAR. The resources/ directory was added by selecting ADD CLASS FOLDER.

Ensure the project dependencies have been added to your project consistent with the display above (don't worry if your JDK is not 1.4.2_02, it should be at least 1.4.x).

Let's go back to the Java perspective and view the .java files and packages that make up your project. The following table defines the packages that make up the Java component of the fasta_service project.

ca.bioinformatics.fasta_service.client The client code.
ca.bioinformatics.fasta_service.client.local Contains the web service client implementation.
ca.bioinformatics.fasta_service.client.junit Test code for the client.
ca.bioinformatics.fasta_service.logging Log4j logging initialization code.
ca.bioinformatics.fasta_service.properties Read properties from resources/fasta_service.config.
ca.bioinformatics.fasta_service.server The server code.
ca.bioinformatics.fasta_service.server.junit Test code for the server.
ca.bioinformatics.fasta_service.server.remote The remote interface for the fasta service.
ca.bioinformatics.fasta_service.server.remote.impl The implementation of the remote interface for the fasta service

The fasta_service package structure. Breaking your code into discrete, related packages is part of good software architecture. Here there are two discrete components of the fasta_service project: the client and the server code. Also, the addition of test cases using the JUnit testing framework allows for rapid testing and migration of changes when developing your Java projects.

It may be surprising that there are fewer .java files in this project than there are packages. The construction of clear packages allows developers to easily navigate and extract pieces of functionality in a large project. Good package naming is an essential part of creating a successful project.

Lastly, let's look at the .java files that exist in this project. The following list describes each file. Each file is also commented (a recommended practice) so that you can open it and view what it does. While reading the file descriptions, open and look at each corresponding file in your Java perspective. Do not worry if you don't understand all the syntax. The main goal is to understand what functionality each file brings to the project.

Java files in fasta_service project

ca.bioinformatics.fasta_service.client.junit.ClientTest.java
The ClientTest file includes test code for testing the client. You can run this test case from the menu by selecting RUN > RUN and selecting the JUnit configuration. Details on how this is done are given in Step 8.
ca.bioinformatics.fasta_service.client.local.FastaServiceAxisClient.java
As mentioned previously, the fasta_service application allows you to use any Internet connection to access a directory containing fasta files located on a server machine. This data interchange is facilitated by an XML-based transport protocol called SOAP. This file specifies how a client can run routines and receive data from a server that is running the fasta_service over this protocol. Notice that the constructor only requires one parameter, the location of the web service (i.e. http://localhost:8080/axis/services/FastaServiceImpl). The decoding/encoding of the SOAP data is facilitated by a Java API called Axis. We will look at this API in more detail when we deploy fasta_service on a Tomcat server in Step 7.
ca.bioinformatics.fasta_service.logging.Logging.java
This class initializes log4j logging in the fasta_service project. It loads the log4j configuration (the levels, layouts, and appenders) from the resources/log4j.fasta_service.properties file.


A Short Introduction to Log4j

Most developers will have the experience of inserting print statements into their code while debugging it, either to output specific variables or to trace code execution through different branches. Log4j is an API developed by several authors under the Apache Software Foundation for logging and tracing code execution. It provides developers with extensive control over log statements embedded in their Java projects. Developers can define specific levels of logging, direct the output to many different formats, view configurable trace information, and control the logging of individual classes and packages. For more information, visit: http://logging.apache.org/log4j/docs/

Poor example of exception logging (where e is an instance of java.lang.Exception):

e.printStackTrace(); // prints the exception stack to standard error

Log4j example of exception logging:

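log.warn("Unexpected error", e); // passing e as the second argument outputs the exception stack to the configured log file (this could be stdout); log.warn(e) alone would only log e.toString()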

Using log4j results in crisp output messages like:

2004-04-06 16:56:51,627 [main] INFO ca.bioinformatics.fasta_service.properties.PropertyLoader - KEY: ca.bioinformatics.fasta_service.FastaDirectory VALUE: /home/smontgom/workspace/fasta_service/data
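Because log4j is a third-party jar, the sketch below illustrates the same idea with java.util.logging from the standard library instead; it is a stand-in, not log4j's API. It logs a failure twice, once as message text only and once with the exception attached as a separate argument, and then shows that only the second record carries the exception (and therefore its stack trace). The class, logger name, and messages are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Stand-in demo using java.util.logging (NOT log4j): message-only vs. exception-attached logging.
class ExceptionLoggingSketch {

    // Collects log records in memory so we can inspect what was logged.
    static class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();
        @Override public void publish(LogRecord record) { records.add(record); }
        @Override public void flush() { }
        @Override public void close() { }
    }

    // Logs the same failure two ways and returns the two captured records.
    static List<LogRecord> logBothWays(Exception e) {
        Logger log = Logger.getLogger("fasta_service.sketch"); // hypothetical logger name
        log.setUseParentHandlers(false);                       // keep the demo off the console
        CapturingHandler handler = new CapturingHandler();
        log.addHandler(handler);
        try {
            log.log(Level.WARNING, String.valueOf(e));       // message only: no throwable recorded
            log.log(Level.WARNING, "Unexpected error", e);   // throwable attached: stack trace available
        } finally {
            log.removeHandler(handler);
        }
        return handler.records;
    }

    public static void main(String[] args) {
        for (LogRecord r : logBothWays(new IllegalStateException("no such file"))) {
            System.out.println(r.getMessage() + " | exception attached: " + (r.getThrown() != null));
        }
    }
}
```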


ca.bioinformatics.fasta_service.properties.PropertyLoader.java
There are lots of ways of loading data into an application at runtime (data that is populated during the execution of the application). Java includes a Properties class that represents a persistent set of data. These properties can be loaded from or saved to a stream. This class loads custom properties from the resources/fasta_service.config file at runtime. It also includes "getter" functions for obtaining properties loaded from this file. Our application uses these properties to set the fasta directory that is published by the fasta_service web service and to specify the location of this web service on the Internet. Open up the resources/fasta_service.config file to see examples of how these properties are set.
ca.bioinformatics.fasta_service.server.remote.FastaService.java
This is the remote interface for the fasta_service web service. It defines the functions that are deployed as a web service, their input parameters, and return values. This interface does not necessarily have to be included in this software but since our web service implementation needs to call these functions and our client needs a matching set of functions (that represent what it can call on the server), it is a good idea to write an interface. This interface ensures that future changes to this web service are also propagated in the client code.
ca.bioinformatics.fasta_service.server.remote.impl.FastaServiceImpl.java

This is the implementation of the FastaService interface. It handles all the business code for reading and sending fasta file information. It contains two nested filename filter classes which restrict publishing only to those files that have a .fa or .fasta extension.

This class is the one that we will later publish as a web service. Any Java client with a copy of the FastaService interface can make calls to this class over the Internet (see the figure below)

Web service interactions for FastaService

Figure 11: Client-server class model. This diagram gives a general overview of how the Java client code accesses the server code through the Internet.
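The interface-plus-implementation arrangement, and the filename filtering, can be sketched in a few lines of plain Java. This is a simplified stand-in, not the project's code: the names FastaListingSketch, FastaFileLister, and DirectoryFastaLister are invented, the method signature is assumed (only the name getFastaFilenames comes from the project's clients), and a single filter is used where FastaServiceImpl has two. It assumes a current JDK.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.Arrays;

// Simplified, hypothetical stand-in for the FastaService / FastaServiceImpl pair.
class FastaListingSketch {

    // The "remote interface": what a client is allowed to call.
    interface FastaFileLister {
        String[] getFastaFilenames();
    }

    // Accepts only file names that end in .fa or .fasta.
    static final FilenameFilter FASTA_FILTER = new FilenameFilter() {
        @Override
        public boolean accept(File dir, String name) {
            return name.endsWith(".fa") || name.endsWith(".fasta");
        }
    };

    // The "implementation": publishes the fasta files found in one directory.
    static class DirectoryFastaLister implements FastaFileLister {
        private final File directory;

        DirectoryFastaLister(File directory) {
            this.directory = directory;
        }

        @Override
        public String[] getFastaFilenames() {
            String[] names = directory.list(FASTA_FILTER);
            if (names == null) {
                return new String[0]; // not a directory, or not readable
            }
            Arrays.sort(names);
            return names;
        }
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        FastaFileLister lister = new DirectoryFastaLister(dir); // callers see only the interface
        for (String name : lister.getFastaFilenames()) {
            System.out.println("Filename: " + name);
        }
    }
}
```

Because the caller in main only refers to the interface type, the implementation behind it could be swapped for a remote one without changing the calling code, which is the point of keeping a shared interface.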

ca.bioinformatics.fasta_service.server.remote.junit.ServiceTest.java
This ServiceTest file includes test code for testing the service implementation (i.e. FastaServiceImpl). This can be run at any time to ensure that your server is executing the way it is intended to. You can run this test case from the menu by selecting RUN > RUN and selecting the JUnit configuration.
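Returning briefly to the PropertyLoader description above, here is a minimal sketch of loading key/value settings with java.util.Properties. The two keys are the ones that appear in this lab; the file contents and the full URI value are assumptions made for illustration (your resources/fasta_service.config may differ), and PropertyLoadingSketch is not the project's PropertyLoader. It assumes a current JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch of property loading; not the project's PropertyLoader class.
class PropertyLoadingSketch {

    static final String DIRECTORY_KEY = "ca.bioinformatics.fasta_service.FastaDirectory";
    static final String URI_KEY = "ca.bioinformatics.fasta_service.URI";

    // Parses text in the java.util.Properties format (key=value lines, # comments).
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Assumed contents, standing in for resources/fasta_service.config
        String config =
              "# directory published by the web service\n"
            + DIRECTORY_KEY + "=/home/smontgom/workspace/fasta_service/data\n"
            + "# where clients can reach the web service\n"
            + URI_KEY + "=http://localhost:8080/axis/services/FastaServiceImpl\n";

        Properties props = parse(config);
        System.out.println("KEY: " + DIRECTORY_KEY + " VALUE: " + props.getProperty(DIRECTORY_KEY));
        System.out.println("KEY: " + URI_KEY + " VALUE: " + props.getProperty(URI_KEY));
    }
}
```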

Now that we are familiar with the packages and files that make up the fasta_service project, let's try to run the ServiceTest and verify that it works. The ServiceTest test case class, as mentioned above, acts to test the server-side implementation of our fasta_service web service; it requires no web server or Internet connection. To run this test select RUN > RUN from the menu bar. Select JUnit from the CONFIGURATIONS tree. Fill out the TEST tabbed pane exactly as shown in the figure below.

Run ServiceTest in Eclipse

Figure 12: Configuration for the JUnit server test. In this window, the JUnit Configuration has been selected and TEST tabbed pane has been opened. The test has been named ServiceTest for the fasta_service project. The full name of the test class that we are running is ca.bioinformatics.fasta_service.server.remote.junit.ServiceTest.

Once you have entered in the JUnit information, hit RUN. This test will interrogate the fasta files in your test/ directory and output the header and sequence information (log4j has directed the output to your standard output window).



A Short Introduction to JUnit

JUnit is a unit-testing framework for Java. It allows developers to do regression testing on their software systems during their software's lifecycle. It is essentially all about creating test cases that can verify key pieces of the system's functionality without requiring the user to rerun the whole software system.

A popular form of programming called "extreme programming" centers on unit-testing. In "extreme programming" examples, developers create their test code before they create any application code. This ensures that the design of the system is well thought out and that the goal of development is clearly defined: meeting the expectations of the unit tests.

Unit testing is an extremely good practice for programmers, and JUnit makes it easy to integrate tests into your projects (you have already noticed that the JUnit framework is integrated into Eclipse). With a comprehensive testing framework, developers can rapidly verify that all the components of their system meet expectations.

And remember, for those that become unit-testing aficionados, that "green is good".

More information on JUnit can be found at http://www.junit.org
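To give a feel for what a unit test looks like, here is a small self-contained sketch in plain Java. It does not use the JUnit API (so it runs without the JUnit jar); the check method plays the role that JUnit's assertions would play in a real test case. The class and the headerOf function are invented for illustration and are not part of fasta_service.

```java
// Hand-rolled unit test sketch (plain Java, NOT the JUnit API).
class FastaHeaderCheck {

    // Code under test: returns the header line of a fasta record without the leading '>'.
    static String headerOf(String record) {
        if (record == null || !record.startsWith(">")) {
            throw new IllegalArgumentException("fasta records start with '>'");
        }
        int end = record.indexOf('\n');
        String firstLine = (end < 0) ? record : record.substring(0, end);
        return firstLine.substring(1).trim();
    }

    // Fails loudly when the actual value differs from the expected one.
    static void check(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected <" + expected + "> but was <" + actual + ">");
        }
    }

    // Each test method verifies one small piece of behaviour.
    static void testHeaderIsExtracted() {
        check("seq1 example", headerOf(">seq1 example\nACGT\n"));
    }

    static void testMissingMarkerIsRejected() {
        try {
            headerOf("ACGT");
        } catch (IllegalArgumentException expected) {
            return; // the bad input was rejected, as the test demands
        }
        throw new AssertionError("expected an IllegalArgumentException");
    }

    public static void main(String[] args) {
        testHeaderIsExtracted();
        testMissingMarkerIsRejected();
        System.out.println("all checks passed");
    }
}
```

In the test-first style described above, the two test methods would be written before headerOf, and headerOf would then be filled in until both pass.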



In the next step, we will be looking at Ant for automating build and test procedures. Ant allows users to specify complicated build and execution environments that are machine-independent (i.e. they do not require that everyone is familiar with the same IDE, or know how the author(s) designed the project, or have the same build-specific dependencies installed).

STEP 5: Explore the Ant script

As mentioned in the introduction, Ant is an operating-system-independent, Java-based make utility. It allows developers to write complicated compile and build instructions that can be easily run (by themselves or others). The default Ant file is usually named build.xml. Open the Java Perspective and scroll to the bottom of the Package Explorer. You should be able to see a file called build.xml in the fasta_service project. In this step, we will examine the contents of this file and add a new build target that enables you to run the ServiceTest from Step 4.

Open the build.xml file. The contents should look like standard XML (it is!). There are essentially two parts to this file: the specification of build and compile properties/classpaths, and the target definitions. There is at least one target in an Ant script; targets are the make steps that Ant executes. Below is the first part of the Ant script; let's explore it in more detail (explanatory text is in bold).



<project name="fasta_service" basedir="." default="build">
The project name and the default Ant target are specified at the top of the build script
    <!-- ========================= PROPERTIES ============================= -->
    <property environment="env" />
Environment variables are imported from the system
    <property file="buildPersonal.properties"/>
Properties are loaded from the file buildPersonal.properties if it exists
    <property name="src" location="src"/>
    <property name="lib" location="lib"/>
    <property name="resources" location="resources"/>
    <property name="classes" location="bin"/>
    <property name="filesToDeploy" value="filesToDeploy" />
The directory structure of the project is specified as properties (if it should change in the future, you only have to change it here)
    <property name="fasta_serviceZipFile" value="fasta_service.zip"/>
    <property name="fasta_serviceTarFile" value="fasta_service.tar"/>
    <property name="fasta_serviceJarFile" value="fasta_service.jar"/>
Various distribution files that we will create during a build
    <!-- ========================= CLASSPATH ============================= -->
    <!-- system classpath lower priority to anything defined in this file -->
    <property name="build.sysclasspath" value="last"/>
Add CLASSPATH information from the system to the end of the project CLASSPATH
    <path id="project.class.path">
Set the CLASSPATH dependencies for the project (include all the jars in lib/ and the files in the resources/ folder)
        <fileset dir="${lib}">
            <include name="**/*.jar"/>
        </fileset>
        <fileset dir="${resources}">
            <include name="**/*"/>
        </fileset>
    </path>

    <path id="axis.fasta_service.class.path">
Set the CLASSPATH dependencies for the deployed web service (include all the jars in the axis lib/ folder and the files in the axis classes folder - we will look at this more in Step 6 and 7)
         <fileset dir="${env.CATALINA_HOME}/webapps/axis/WEB-INF/lib">
             <include name="**/*.jar"/>
         </fileset>
         <fileset dir="${env.CATALINA_HOME}/webapps/axis/WEB-INF/classes">
             <include name="**/fasta_service.xml"/>
         </fileset>
    </path>
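One detail worth knowing when reading the properties above: an Ant property cannot be redefined once it has been set, so whichever definition comes first wins. Because buildPersonal.properties is loaded before the defaults, any value it defines takes precedence over the default further down. The Java sketch below mimics that rule and the ${name} substitution. It is an illustration only, not Ant's implementation; the override value "build/classes" is hypothetical, and values are treated as plain strings (Ant additionally resolves location attributes against basedir). It assumes a current JDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration of Ant-style properties: first definition wins, ${name} is substituted.
class AntPropertySketch {

    private static final Pattern REFERENCE = Pattern.compile("\\$\\{([^}]+)\\}");

    private final Map<String, String> properties = new LinkedHashMap<>();

    // Sets a property only if it has not been set before.
    void define(String name, String value) {
        properties.putIfAbsent(name, value);
    }

    // Replaces each ${name} with its value; undefined names are left as they are.
    String expand(String text) {
        Matcher m = REFERENCE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = properties.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        AntPropertySketch ant = new AntPropertySketch();
        ant.define("classes", "build/classes"); // hypothetical entry from buildPersonal.properties
        ant.define("classes", "bin");           // default from build.xml, ignored: already set
        ant.define("filesToDeploy", "filesToDeploy");
        ant.define("fasta_serviceJarFile", "fasta_service.jar");

        System.out.println(ant.expand("${classes}"));
        System.out.println(ant.expand("${filesToDeploy}/${fasta_serviceJarFile}"));
    }
}
```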


The first part of the script sets up the environment properties for our Java project. The second part of the script specifies the individual targets that a user can invoke from Ant (and the target dependencies). Below is the second part of the Ant script (explanatory text is in bold). The "deploy" and "undeploy" targets have been excluded (they will be discussed in Step 7).



    <!-- =========================== TASKS =============================== -->

    <target name="build" depends="init,compile,jar">
    </target>
This is the build target. Nothing is specified in this target but its dependency targets ensure that the init, compile, and jar targets are run whenever the build target is invoked.
    <target name="init">
        <mkdir dir="${filesToDeploy}"/>
    </target>
Create the filesToDeploy/ directory. This is where the distribution files will be written to (i.e. the fasta_service jar, and zip files)
    <target name="compile">
        <mkdir dir="${classes}"/>
        <!-- compile java source files -->
        <javac srcdir="${src}" destdir="${classes}" debug="${COMPILE_DEBUG_FLAG}">
            <classpath refid="project.class.path"/>
        </javac>
    </target>
Compile the java code using javac and the project CLASSPATH (as specified earlier in this script). Write the class files to the classes directory (in our case the bin/ directory)
    <target name="clean">
        <!-- clean out directories from previous builds -->
        <delete includeEmptyDirs="true" failonerror="false">
            <fileset dir="${filesToDeploy}"/>
        </delete>
        <!-- clean up the classes folder -->
        <delete includeEmptyDirs="true" failonerror="false">
            <fileset dir="${classes}"/>
        </delete>
    </target>
Clean the class directory and the filesToDeploy/ directory
	<target name="jar">
Create the distribution files
        <jar jarfile="${filesToDeploy}/${fasta_serviceJarFile}">
                <fileset dir="${classes}" includes="**/bioinformatics/**" excludes="**/junit/**"/>
                <fileset dir="resources" includes="log4j.fasta_service.properties"/>
                <fileset dir="resources" includes="fasta_service.config"/>
        </jar>
Create the fasta_service jar file
        <zip zipfile="${filesToDeploy}/${fasta_serviceZipFile}">
                <fileset dir="${filesToDeploy}" includes="${fasta_serviceJarFile}"/>
                <fileset dir="${resources}" includes="log4j.fasta_service.properties"/>
                <fileset dir="${resources}" includes="fasta_service.config"/>
                <fileset dir="${lib}">
                    <include name="**/*.jar"/>
                </fileset>
        </zip>
Create the fasta_service zip file
        <tar destfile="${filesToDeploy}/${fasta_serviceTarFile}">
                <tarfileset dir="${filesToDeploy}" includes="${fasta_serviceJarFile}"/>
                <tarfileset dir="${resources}" includes="log4j.fasta_service.properties"/>
                <tarfileset dir="${resources}" includes="fasta_service.config"/>
                <tarfileset dir="${lib}">
                    <include name="**/*.jar"/>
                </tarfileset>
        </tar>
Create the fasta_service tar file
        <gzip zipfile="${filesToDeploy}/${fasta_serviceTarFile}.gz" 
		src="${filesToDeploy}/${fasta_serviceTarFile}"/>
GZIP the fasta_service tar file
	</target>
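The depends attribute is what ties the targets above together: before running a target, Ant runs the targets it depends on, in the order listed, and each target runs at most once per invocation. The Java sketch below works out that order for the targets in this script. It is a simplified model for illustration, not Ant's own code, and assumes a current JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified model of how target dependencies determine execution order.
class TargetOrderSketch {

    private final Map<String, List<String>> dependsOn = new LinkedHashMap<>();

    // Registers a target together with the targets named in its depends attribute.
    void target(String name, String... depends) {
        dependsOn.put(name, Arrays.asList(depends));
    }

    // Returns the order in which targets run when `name` is requested.
    List<String> executionOrder(String name) {
        Set<String> done = new LinkedHashSet<>();
        visit(name, done, new ArrayList<String>());
        return new ArrayList<>(done);
    }

    // Depth-first walk: dependencies first (left to right), then the target itself.
    private void visit(String name, Set<String> done, List<String> inProgress) {
        if (done.contains(name)) {
            return; // already scheduled, so it will not run twice
        }
        if (inProgress.contains(name)) {
            throw new IllegalStateException("circular dependency at " + name);
        }
        List<String> deps = dependsOn.get(name);
        if (deps == null) {
            throw new IllegalArgumentException("unknown target " + name);
        }
        inProgress.add(name);
        for (String dep : deps) {
            visit(dep, done, inProgress);
        }
        inProgress.remove(name);
        done.add(name);
    }

    public static void main(String[] args) {
        TargetOrderSketch script = new TargetOrderSketch();
        script.target("build", "init", "compile", "jar");
        script.target("init");
        script.target("compile");
        script.target("clean");
        script.target("jar");

        System.out.println(script.executionOrder("build")); // prints [init, compile, jar, build]
    }
}
```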


Now that we have skimmed the structure of the build.xml Ant script in the fasta_service project, try running the "build" target from this script in Eclipse; right-click on build.xml in the Package explorer and select RUN AS ANT from the pop-up menu. The following window should appear.

Running Ant in Eclipse

Figure 13: Run as Ant window. The targets that are present in the build.xml file are displayed in the TARGETS tabbed pane. The build target can be run by ensuring that it is selected (checked) and RUN is pressed.

As mentioned in the introduction, more information on Ant's integration in Eclipse can be found at http://www.onjava.com/onjava/2004/02/04/AntEclipse.pdf.

TRY IT OUT

At this point, try writing an Ant target that runs the ServiceTest test case from the previous step. You will need to use Google, the Ant manual at http://ant.apache.org/manual/, or Writing a Simple Buildfile. The latter resource contains information on how an Ant file is made and will help you in creating a target for the ServiceTest test case. Spend no more than 10 minutes. You can either write and test the target in the build.xml file or describe below what tasks you would use in it.

How to build a test target for the ServiceTest test case in Ant
Target Name:
Tasks required:




Note: If you decided to use the JUnit task, describe what you would have to do to incorporate an optional task into your Ant file. You may need to use the Ant manual or Google to find examples of how this is done.

In this step, we have examined an Ant script for the fasta_service project. When you distribute Ant scripts with your projects, other developers can easily build and test the associated project code. Furthermore, Ant targets allow users to run complicated tasks that may be application specific. In the next two steps, we will look at how we can deploy our fasta_service project to a web server so that it is accessible over the Internet (we will do this using Ant).

STEP 6: Visit the Tomcat directory

As mentioned, the fasta_service project is a web service that is accessible over the Internet using a protocol called SOAP. Deploying our web service over the Internet requires that we use an application server; in this case we will be using Tomcat.

Essential Definitions

  • Web service - A web service facilitates application communication over the Internet using the SOAP transport protocol. Web services are designed to be language-independent.
  • SOAP - (from the draft W3C specification) SOAP is a lightweight protocol for exchange of information in a decentralized, distributed environment. It is an XML-based protocol that consists of three parts: an envelope that defines a framework for describing what is in a message and how to process it, a set of encoding rules for expressing instances of application-defined datatypes, and a convention for representing remote procedure calls and responses.
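To make the envelope and the remote procedure call convention less abstract, the sketch below builds a minimal SOAP 1.1 request in Java and reads the called method's name back out of it. This is a hand-written illustration: real messages produced by a SOAP engine carry more detail (encoding attributes, parameters), and the namespace string FastaServiceImpl and method name getFastaFilenames are simply the ones used by this lab's Perl client. The class SoapEnvelopeSketch is invented and assumes a current JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hand-built SOAP 1.1 request, for illustration only (a SOAP engine normally does this).
class SoapEnvelopeSketch {

    // Namespace of the SOAP 1.1 envelope elements
    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

    // Builds an envelope whose Body calls `method` with no arguments.
    static String requestFor(String serviceNamespace, String method) {
        return "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_ENV + "\">"
             + "<soapenv:Body>"
             + "<ns1:" + method + " xmlns:ns1=\"" + serviceNamespace + "\"/>"
             + "</soapenv:Body>"
             + "</soapenv:Envelope>";
    }

    // Returns the local name of the first element inside the Body (the method being called).
    static String calledMethod(String envelope) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(envelope)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not a readable SOAP envelope", e);
        }
        Node body = doc.getElementsByTagNameNS(SOAP_ENV, "Body").item(0);
        if (body == null) {
            throw new IllegalArgumentException("no SOAP Body element");
        }
        for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return n.getLocalName();
            }
        }
        throw new IllegalArgumentException("empty SOAP Body");
    }

    public static void main(String[] args) {
        String request = requestFor("FastaServiceImpl", "getFastaFilenames");
        System.out.println(request);
        System.out.println("method: " + calledMethod(request));
    }
}
```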

In the first row of the table below, write the location of the Tomcat directory on your workstation. If it is not installed, I suggest installing the latest 4.x version of Tomcat from http://jakarta.apache.org/tomcat/

Tomcat directory (CATALINA_HOME)  
Executable directory $CATALINA_HOME/bin/
To start Tomcat sh $CATALINA_HOME/bin/startup.sh
To stop Tomcat sh $CATALINA_HOME/bin/shutdown.sh

Tomcat location. CATALINA_HOME is the name of the environment variable that describes the root directory of Tomcat on your workstation. Starting and stopping Tomcat is as easy as running one command.

Verify that CATALINA_HOME is a valid environment variable on your system by typing echo $CATALINA_HOME. If you see an empty line instead of a directory after running this command, open the .bashrc file located in your home directory and add the following line (replacing somedir with the directory of Tomcat):

export CATALINA_HOME=somedir

To deploy our web service, we also require the Apache Axis SOAP engine plug-in for Tomcat. Apache Axis should be already installed into your Tomcat server. You can read about how this installation was done at http://ws.apache.org/axis/java/install.html



Introduction to Axis

(from the Axis User's Guide)

Axis is essentially a SOAP engine -- a framework for constructing SOAP processors such as clients, servers, gateways, etc. The current version of Axis is written in Java, but a C++ implementation of the client side of Axis is being developed.

But Axis isn't just a SOAP engine -- it also includes:

  • a simple stand-alone server,
  • a server which plugs into servlet engines such as Tomcat,
  • extensive support for the Web Service Description Language (WSDL),
  • emitter tooling that generates Java classes from WSDL,
  • some sample programs, and
  • a tool for monitoring TCP/IP packets.

Axis is the third generation of Apache SOAP (which began at IBM as "SOAP4J"). In late 2000, the committers of Apache SOAP v2 began discussing how to make the engine much more flexible, configurable, and able to handle both SOAP and the upcoming XML Protocol specification from the W3C.

After a little while, it became clear that a ground-up rearchitecture was required. Several of the v2 committers proposed very similar designs, all based around configurable "chains" of message "handlers" which would implement small bits of functionality in a very flexible and composable manner.

After months of continued discussion and coding effort in this direction, Axis now delivers the following key features:

  • Speed. Axis uses SAX (event-based) parsing to achieve significantly greater speed than earlier versions of Apache SOAP.
  • Flexibility. The Axis architecture gives the developer complete freedom to insert extensions into the engine for custom header processing, system management, or anything else you can imagine.
  • Stability. Axis defines a set of published interfaces which change relatively slowly compared to the rest of Axis.
  • Component-oriented deployment. You can easily define reusable networks of Handlers to implement common patterns of processing for your applications, or to distribute to partners.
  • Transport framework. We have a clean and simple abstraction for designing transports (i.e., senders and listeners for SOAP over various protocols such as SMTP, FTP, message-oriented middleware, etc), and the core of the engine is completely transport-independent.
  • WSDL support. Axis supports the Web Service Description Language, version 1.1, which allows you to easily build stubs to access remote services, and also to automatically export machine-readable descriptions of your deployed services from Axis.



In the next step, we will look at how to deploy fasta_service to Tomcat/Axis using our Ant script.

STEP 7: Build and deploy the web service using Ant

In this step, we will start the Tomcat service and deploy the fasta_service code to Tomcat/Axis using Ant. This will allow anyone to access fasta files from your computer from anywhere on the Internet (in Step 8, we will test this).

Start the Tomcat server by running the start command from Step 6. You should be able to open your web browser and verify that the Tomcat server is running by visiting http://localhost:8080. Furthermore, you can verify what web services are running via Apache Axis by visiting http://localhost:8080/axis (see Figure 14); here you will see various scripts that verify which web services are deployed and that validate the Axis configuration. Click on VALIDATE in your Internet browser and verify that all the needed components are found. Also, try clicking on VIEW (the list of deployed web services); you should see two: AdminService and Version. These services are installed by default and allow developers to administer their Axis configuration.

Axis Configuration in Mozilla

Figure 14: Apache-AXIS Configuration page in Mozilla. From this page you can validate the Axis configuration in Tomcat and view the list of deployed web services.

Let's now deploy the fasta_service web service using our Ant script and then verify that this has been deployed properly. Select build.xml from the Package Explorer in Eclipse. Right-click and select RUN AS ANT from the pop-up menu. Select the deploy target from the TARGETS tabbed pane and click on RUN. Verify that the FastaServiceImpl class has been deployed as a web service by visiting http://localhost:8080/axis and clicking on the VIEW hyperlink.

FastaServiceImpl has now been deployed and is accessible over the Internet (as long as Tomcat is running and you are not behind a firewall or NAT point). Ant makes it easy to deploy and undeploy web services to Tomcat/Axis. Let's now look at the Ant deploy and undeploy targets in more detail to see how this is done.



    <target name="deploy" depends="build,jar,init-ws">
The deploy target for Ant. Deploys fasta_service to Tomcat/Axis
        <copy file="${filesToDeploy}/${fasta_serviceJarFile}"
              todir="${env.CATALINA_HOME}/webapps/axis/WEB-INF/lib" />
Copies the fasta_service.jar file to the Axis web applications lib/ folder
        <copy todir="${env.CATALINA_HOME}/webapps/axis/WEB-INF/lib">
                    <fileset dir="${lib}">
                        <include name="**.jar"/>
                    </fileset>
        </copy>
Copies the fasta_service jar dependencies to the web applications lib/ folder
        <copy file="${resources}/log4j.fasta_service.properties"
              todir="${env.CATALINA_HOME}/webapps/axis/WEB-INF/classes" />
        <copy file="${resources}/fasta_service.config"
              todir="${env.CATALINA_HOME}/webapps/axis/WEB-INF/classes" />
Copies the resource files to the web applications class folder (they are now part of the default CLASSPATH)
        <java classname="org.apache.axis.client.AdminClient" fork="true">
              <classpath refid="axis.fasta_service.class.path"/>
              <arg file="${resources}/deploy-fasta_service.wsdd"/>
        </java>
Run the AdminClient with the fasta_service's deploy WSDD file (discussed in the next section)
    </target>

    <target name="undeploy" depends="init-ws">
The undeploy target for Ant. Undeploys fasta_service from Tomcat/Axis
        <java classname="org.apache.axis.client.AdminClient"
            fork="yes">
              <classpath refid="axis.fasta_service.class.path"/>
              <arg file="${resources}/undeploy-fasta_service.wsdd"/>
        </java>
Run the AdminClient with the fasta_service's undeploy WSDD file (discussed in the next section)
    </target>


Note that we have not discussed the "init-ws" target; this target is responsible for verifying that a Tomcat server exists (it looks for the CATALINA_HOME environment variable, which we set earlier).
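The kind of check that init-ws performs can be pictured with a few lines of Java. This is only an analogy under our own assumptions: the real target is written in Ant and is not shown in this lab, and the class CatalinaHomeCheck is invented. The sketch looks up CATALINA_HOME and derives the Axis WEB-INF directory that the deploy target copies files into. It assumes a current JDK.

```java
import java.io.File;
import java.util.Map;

// Java analogy of an "is Tomcat configured?" check; the real init-ws target is an Ant target.
class CatalinaHomeCheck {

    // Returns <CATALINA_HOME>/webapps/axis/WEB-INF, or fails when the variable is not set.
    static File axisWebInf(Map<String, String> env) {
        String home = env.get("CATALINA_HOME");
        if (home == null || home.trim().isEmpty()) {
            throw new IllegalStateException("CATALINA_HOME is not set; cannot locate Tomcat");
        }
        return new File(new File(home), "webapps/axis/WEB-INF");
    }

    public static void main(String[] args) {
        try {
            File webInf = axisWebInf(System.getenv());
            System.out.println("Axis WEB-INF: " + webInf
                    + (webInf.isDirectory() ? "" : " (directory not found)"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```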

The deployment tasks of our fasta_service project reduce simply to copying the required files to the appropriate Axis directory and running the AdminClient with our WSDD (WSDD: Web Service Deployment Descriptor) deployment file. AdminClient is just the application for interpreting and processing WSDD files. A WSDD file contains a bunch of things you want to "deploy" into Axis - i.e. make available to the Axis engine. Let's examine the content of fasta_service's WSDD deployment file located in the resources/ folder. Open this file in Eclipse by clicking on it in the Package Explorer from the Java perspective.



The deployment WSDD file
<deployment xmlns="http://xml.apache.org/axis/wsdd/"

xmlns:java="http://xml.apache.org/axis/wsdd/providers/java">

 <service name="FastaServiceImpl" provider="java:RPC">
The name of the service we are going to deploy
  <parameter name="className" value="ca.bioinformatics.fasta_service.server.remote.impl.FastaServiceImpl"/>
The name of the class that represents this service (remember that this is our service implementation class).
  <parameter name="allowedMethods" value="*"/>
Deploy all public methods in this class
 </service>

</deployment>


The WSDD deployment file is pretty straightforward. Here a web service name and the associated class have been specified. By copying this class and the dependencies for this class to the Axis directory in your Tomcat server and running AdminClient, this class's public methods are accessible to clients over the Internet.
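Since a WSDD file is ordinary XML, it can be inspected with the XML parser that ships with Java. The sketch below reads a deployment descriptor like the one above and reports which class backs each service. WsddReaderSketch is an invented helper for illustration; it is not part of Axis or of fasta_service, and it is not how AdminClient itself processes the file. It assumes a current JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: lists service name -> implementing class from a WSDD document.
class WsddReaderSketch {

    static final String WSDD_NS = "http://xml.apache.org/axis/wsdd/";

    static Map<String, String> services(String wsdd) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(wsdd)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse WSDD", e);
        }
        Map<String, String> result = new LinkedHashMap<>();
        NodeList serviceNodes = doc.getElementsByTagNameNS(WSDD_NS, "service");
        for (int i = 0; i < serviceNodes.getLength(); i++) {
            Element service = (Element) serviceNodes.item(i);
            NodeList params = service.getElementsByTagNameNS(WSDD_NS, "parameter");
            for (int j = 0; j < params.getLength(); j++) {
                Element param = (Element) params.item(j);
                // the className parameter names the class that implements the service
                if ("className".equals(param.getAttribute("name"))) {
                    result.put(service.getAttribute("name"), param.getAttribute("value"));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String wsdd =
              "<deployment xmlns=\"http://xml.apache.org/axis/wsdd/\""
            + " xmlns:java=\"http://xml.apache.org/axis/wsdd/providers/java\">"
            + "<service name=\"FastaServiceImpl\" provider=\"java:RPC\">"
            + "<parameter name=\"className\""
            + " value=\"ca.bioinformatics.fasta_service.server.remote.impl.FastaServiceImpl\"/>"
            + "<parameter name=\"allowedMethods\" value=\"*\"/>"
            + "</service>"
            + "</deployment>";
        for (Map.Entry<String, String> e : services(wsdd).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```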

If the syntax of the deployment file seemed straightforward, the WSDD file used for undeploying the web service is simpler still. Let's look at it.



The undeployment WSDD file
<undeployment xmlns="http://xml.apache.org/axis/wsdd/">

  <service name="FastaServiceImpl"/>
The name of the service to undeploy
</undeployment>


Not much to it. By running this WSDD file with AdminClient the fasta_service project is undeployed as a web service. For more information on WSDD conventions, read the Apache Axis User's Guide at http://ws.apache.org/axis/java/user-guide.html.

In this step, we have started our fasta_service project web service over Tomcat/Axis. The methods specified by the class FastaServiceImpl are now available to other developers over the Internet. In the next step, we will look at how to run this service from a remote location.

STEP 8: Run the Java and then the Perl client

One of the large benefits of deploying code as a web service over Tomcat/Axis is that it is accessible to developers independent of their choice of development language. In this step, we will be running the fasta_service web service using both a Java and a Perl client. Both clients require the decoding/encoding of SOAP data via a SOAP engine. As mentioned previously, Apache Axis is the Java-based SOAP engine. The Perl SOAP engine that we will use is called SOAP::Lite (more information on SOAP::Lite can be found at http://www.soaplite.com).

Running the client in Java

There are only two classes that are required to use the fasta_service web service: ca.bioinformatics.fasta_service.client.local.FastaServiceAxisClient and ca.bioinformatics.fasta_service.server.remote.FastaService. Review the contents of these files either in Eclipse or from Appendix II. There is a lot of Axis-specific syntax in the FastaServiceAxisClient; learning this is beyond the scope of this lab, but it is pretty generic code that enables an application developer to invoke methods and interpret data from our web service (once you start writing this service code it becomes routine very quickly). The second class, FastaService, is just the interface for our server implementation. It is not strictly needed to run our web service; I have included it so that changes to the web service methods can be easily migrated to both the client (FastaServiceAxisClient) and server (FastaServiceImpl) classes.

You can test out the web service by running the JUnit test case called ca.bioinformatics.fasta_service.client.junit.ClientTest. If you are unsure of how to do this, review how we ran the ServiceTest test case in Step 4. You should see successful results like those in the figure below.

Running the Java Client in Eclipse

Figure 15: Running the ClientTest. If no errors occur, the Console window should show several log4j lines describing the contents of one of the files on the server. The ClientTest test case is run in exactly the same way as the ServiceTest from Step 4.

You may be wondering how the client found the remote web service. The location of the web service has been specified in your resources/fasta_service.config file as ca.bioinformatics.fasta_service.URI. You can change localhost to point to other locations in the classroom and view the fasta files on these computers. Try it out.

Running the client in Perl

The SOAP engine for Perl is called SOAP::Lite. Using SOAP::Lite we can access the methods published by our web service despite the fact that they were written in Java. Try running the Perl script called runClient.pl from the console (it is located in the perl/ directory of the fasta_service project). You should see results similar to those displayed in the figure below.

Running the Perl Client in the Console

Figure 16: The Perl client being run in the console. This Perl script has used SOAP::Lite to access our Java web service.

Let's briefly look at the core parts of the runClient.pl script and see how it accesses our Tomcat/Axis service using SOAP.



#!/usr/bin/perl

use strict;
use SOAP::Lite;    ## Import the SOAP::Lite Perl library

################### CONNECT TO FASTA SERVICE RUNNING ON LOCALHOST

## Connect to the fasta_service running on localhost
my $mga = SOAP::Lite
        ->uri('FastaServiceImpl')
        ->proxy('http://localhost:8080/axis/services/');

################### GET FILENAMES

## Get the fasta filenames. Note that these are the methods
## that we exported from our web service.
my $response = $mga->getFastaFilenames();

my $a_filename;    ## Holds the first file that is returned

if ($response->fault) {
  ## Fail if the method could not be invoked: die with the fault string
  die $response->faultstring;
} else {
  ## If successful, get the results from the response
  my @result = @{$response->result};
  foreach my $result (@result) {
     print "Filename: " . $result . "\n";    ## Print each filename

     ## Grab the first file to be returned;
     ## we will get the fasta header and sequence from this file
     unless (defined $a_filename) { $a_filename = $result; }
  }
}

Look at the complete runClient.pl script in Appendix II.
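
To make the language independence more concrete, the sketch below builds an approximation of the SOAP 1.1 request that a call such as getFastaFilenames() turns into on the wire. It is an illustration only: the exact XML that SOAP::Lite or Axis emits (prefixes, encoding attributes, HTTP headers) will differ, and the class name SoapEnvelopeSketch and the method buildRequest are hypothetical. The point is that the request is plain XML sent over HTTP, so any language that can produce it can call the service.

```java
// Hypothetical illustration; real SOAP toolkits generate this for you.
class SoapEnvelopeSketch {
    // Namespace of the SOAP 1.1 envelope.
    static final String SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Builds a minimal request for a method that takes no arguments.
    // serviceNamespace plays the role of uri('FastaServiceImpl') in the Perl script.
    static String buildRequest(String serviceNamespace, String methodName) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<soap:Envelope xmlns:soap=\"" + SOAP_ENV_NS + "\">\n"
             + "  <soap:Body>\n"
             + "    <ns:" + methodName + " xmlns:ns=\"" + serviceNamespace + "\"/>\n"
             + "  </soap:Body>\n"
             + "</soap:Envelope>\n";
    }

    public static void main(String[] args) {
        // Print the approximate request behind $mga->getFastaFilenames().
        System.out.print(buildRequest("FastaServiceImpl", "getFastaFilenames"));
    }
}
```

The server answers with a similar envelope containing either the result or a fault element, which is what the fault and result calls in the Perl script inspect.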


In this step, we looked at how web services can be run over Tomcat/Axis independent of the development language. This is a powerful communication tool for developers: service implementations are maintained by the service developers, while the services themselves remain accessible to a broad range of developers over the SOAP protocol. The last step in this lab is a problem that has been designed to get you working with the debugger in Eclipse and understanding in more detail the concepts that have been presented here. It is optional but highly recommended.

STEP 9: (Optional) Development exercise

In this lab you have been exposed to many different technologies that Java life sciences developers use to make their services and data available to a wide audience. In this step, we will extend the fasta_service web service and explore in more detail the debugging capabilities of Eclipse. This step is optional but highly recommended, as it will solidify your understanding of the concepts that have been presented. Do not expect to finish it in less than an hour if you have not had experience with these technologies. It is recommended that you look at it now, think about it, and then revisit it a week or two after the workshop has ended.

Problem

The fasta_service web service has been designed for simplicity; however, a glaring architectural flaw is that it separates the concept of a fasta header from the associated sequence: these pieces of information are returned in two separate calls. Create a class called Fasta that has the following methods (in addition to the constructor):

  • public String getFastaHeader()
  • public void setFastaHeader(String fasta_header)
  • public String getSequence()
  • public void setSequence(String sequence)

Modify the implementation of the fasta_service web service to use the routine:

  • public Fasta getFastaData(String filename)

instead of the two routines:

  • public String getSequenceForFilename(String filename)
  • public String getHeaderForFilename(String filename)

Hints: You will need to modify the deploy WSDD file to include the new return type. You will also need to register the new return type in FastaServiceAxisClient. Give it a shot and feel free to e-mail me for the answer.
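
As a starting point, the Fasta class itself might look like the minimal sketch below. It covers only the four accessor methods listed above. The public no-argument constructor and the Serializable marker are assumptions about what bean-style serialization in Axis typically expects, not requirements stated in this lab. The class is declared without a package and without the public modifier so the sketch compiles on its own; in the project it would be a public class in its own Fasta.java. FastaDemo and the sample values are hypothetical.

```java
import java.io.Serializable;

// Hypothetical demo showing how the bean would be filled in and read back.
class FastaDemo {
    public static void main(String[] args) {
        Fasta fasta = new Fasta();
        fasta.setFastaHeader(">example_sequence");   // sample data, not from the lab
        fasta.setSequence("ACGTACGT");
        System.out.println(fasta.getFastaHeader());
        System.out.println(fasta.getSequence());
    }
}

// Keeps a fasta header and its sequence together in one object.
class Fasta implements Serializable {
    private static final long serialVersionUID = 1L;

    private String fastaHeader;
    private String sequence;

    // No-argument constructor (assumed to be needed for bean serialization).
    public Fasta() {
    }

    public String getFastaHeader() {
        return fastaHeader;
    }

    public void setFastaHeader(String fasta_header) {
        this.fastaHeader = fasta_header;
    }

    public String getSequence() {
        return sequence;
    }

    public void setSequence(String sequence) {
        this.sequence = sequence;
    }
}
```

The remaining work, changing the service interface and implementation to return this type and registering it on both the server and client sides, is the substance of the exercise.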

STEP 10: Dinner

Congratulations.

Interpreting the data

  • During this lab we have walked quickly through some major concepts. I encourage you to revisit Eclipse, Ant, CVS, and Tomcat in your own work environment. Furthermore, we have demonstrated a language-independent way of making data and services available to developers over the Internet using Axis, SOAP, and SOAP::Lite.
  • Not every tool is the right tool for the job though; hopefully your toolbox is a bit expanded now.



(c) 2004 Stephen Montgomery, Canada's Michael Smith Genome Sciences Centre