Model Driven Design with Matlab
By Elliott Reitz II
Twelfth Annual International Symposium of the International Council On Systems Engineering (INCOSE), 28 July – 1 August 2002
The technical challenge of recognising pertinent information on images of mail-pieces (such as indicia) has encouraged the integration of multiple technologies into a complex modelling environment, developed to build and select the appropriate algorithms and subsystem component designs. A C/C++ library has been developed for components of the system model that are selected for product application. Prototyping has been used extensively in all of the environments. Example products developed with this approach and delivered to customers include: a classifier algorithm for character recognition (USPS RIP2), a Pre-sort-Label detection device (USPS APPS CFT), and a Non-Address-Attribute-System for recognition of indicia (UK Royal Mail Address Interpretation). The results have exceeded expectations for schedule, performance, and development costs.
2. Tools for Modelling
3. Development Processes
4. Development Environments
5. Configuration Management
6. Results of Application
System development is often faced with conflicting requirements for tools, development, and integration. Especially challenging is developing strategic technologies targeted at diverse product applications while still delivering relevant product solutions in time for those products to reach market.
Figure 1. System Development Process
The fields of recognition technologies and control laws are both particularly challenging in this respect, due to the off-line development required to produce the internal parameters these systems depend on.
For example, in a control-law development project for an aircraft, great effort is expended making sure the controls are stable and effective. There are numerous individual controls (e.g. flaps, rudders) that are first designed independently. These obviously affect the resulting integrated control for the aircraft (which contains another layer of control). Thus numerous control designs must be integrated into the system-level control design for the aircraft.
Similarly, recognition system development typically requires development of multiple recognition-subsystem components integrated into a larger recognition-system.
For example, an indicia recognition system aimed at processing mail integrates Glyph, Character, and other low-level recognition subsystems into a system that processes one mail-piece at a time.
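The integration pattern described here can be sketched as a simple cascade of recognition subsystems feeding one system-level decision. The Python below is purely illustrative (the original work used Matlab), and every function and field name in it is hypothetical rather than taken from the actual design.

```python
# Illustrative sketch: low-level recognisers cascaded into a system
# that processes one mail-piece image at a time. All names are
# hypothetical placeholders, not the real subsystem interfaces.

def glyph_recogniser(image):
    # Placeholder: report whether a bright glyph-like mark is present.
    return {"glyph_found": any(p > 200 for row in image for p in row)}

def character_recogniser(image):
    # Placeholder: count dark "character" pixels as a stand-in result.
    return {"char_pixels": sum(p < 50 for row in image for p in row)}

def indicia_system(image, stages):
    # Run each recognition subsystem and merge its evidence.
    evidence = {}
    for stage in stages:
        evidence.update(stage(image))
    # A real system would fuse the evidence into an indicia decision;
    # here we simply flag mail-pieces showing both kinds of evidence.
    evidence["indicia_candidate"] = (
        evidence.get("glyph_found", False)
        and evidence.get("char_pixels", 0) > 0
    )
    return evidence

# One synthetic 3x3 "mail-piece" image (greyscale values 0-255).
mailpiece = [[255, 10, 10],
             [10, 255, 10],
             [10, 10, 255]]
result = indicia_system(mailpiece, [glyph_recogniser, character_recogniser])
```

The point of the sketch is only the shape of the integration: each subsystem is developed independently, then composed into a per-mail-piece pipeline.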
By using system modelling, a systems engineer can integrate the diverse development efforts into real product prototypes. This allows a design to be validated before investing in components that would prove useless. It also identifies the design components most in need of improvement.
For the past 3 years, a team at Lockheed Martin's Distribution Technology division has used this development approach. This team generated the results reported in the Results section below. Members of the team included these key collaborators: David Ii, Dennis Tillotson, Rosemary Paradis, Nina Kung, Larry Albertelli, Charles Call, John Cline, Eric Beezhold, Ben Tyszka, Stan Driggs, and Heather Connery.
In its best form, this sort of system modelling is tightly coupled to the system/subsystem specification and software development tasks. The system model actually becomes the detailed specification for the software design, and it proves system performance before the system is actually built.
From a core technology development perspective, this approach ensures the most cost-effective use of investment and IRAD resources applied to longer-term business requirements. From a product development perspective, it provides a more streamlined development process, thus reducing time to market.
When selecting the tools to be used for System Modelling it is important to consider the business relevance of the tools as applied to the technology at hand.
For example, Matlab is an advanced tool commonly used for the development of recognition systems and control systems. The language of the target system (e.g. C or C++) is another commonly selected option. This default selection may be tenable for simple systems. However, it often implies limited modelling and limited design iteration (anything more becomes expensive as complexity grows), and increased expense in generating the training data needed by the subsystem components (e.g. control-law parameters or Neural Network weights).
Other tools are available too, but some lack the lingual nature of Matlab that allows subsystem models to be integrated into a higher-level simulation. For example, SNNS is a popular tool for Neural Network design, but it does not support the image processing needed to create training and test sets, nor running multiple recognition stages within an integrated environment; Matlab does. In contrast, SNNS does generate run-time C code, while Matlab provides an auto-translation utility that suffers some limitations.
Thus it is appropriate to perform a tools survey before selecting a primary development environment. Once a selection is made, the significant investment that follows locks it in.
Figure 2. Business Relevance
Model-driven designs usually end up in products built in C or C++. Using a different language for prototyping therefore implies a translation delay. The benefit is the ability to build several candidate prototypes very quickly and then select only the best one for the target product, with certainty about its performance. Sometimes this is unnecessary; when it is necessary, however, trying to avoid it is very expensive.
Figure 3. Language Selection
Once a modelling language or tool is selected, it remains important to apply varying degrees of formality to the modelling as appropriate. For example, Matlab supports object-oriented design in the style of C++ using functions, structures, and so on. When developing complex models it is important to take advantage of such high-end techniques (e.g. structures, command parsing) in order to manage the design's complexity. In contrast, simple prototypes and functions may be developed with a "quick and dirty" approach. Complexity should thus be a determining factor in selecting the level of formality within individual prototypes.
Figure 4. Coding Standards
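To make the structure-and-command-parsing idea concrete, here is a minimal sketch in Python rather than Matlab; the configuration fields and the tiny command grammar are invented for illustration and do not reflect the actual models.

```python
# Sketch of managing model complexity with a single configuration
# structure plus simple command parsing, in the spirit of the Matlab
# techniques described above. All field and command names are invented.

def make_model_config():
    # One structure holds every tunable model parameter in one place.
    return {"threshold": 128, "stages": ["roi", "classify"], "verbose": False}

def parse_command(config, command):
    # A tiny command parser: "set <field> <value>" updates the config,
    # so prototypes can be exercised without editing the model source.
    op, field, value = command.split()
    if op != "set" or field not in config:
        raise ValueError(f"unknown command: {command}")
    kind = type(config[field])
    config[field] = (value == "true") if kind is bool else kind(value)
    return config

cfg = make_model_config()
parse_command(cfg, "set threshold 100")
parse_command(cfg, "set verbose true")
```

Centralising parameters this way is what lets a "quick and dirty" prototype mature into a formal component without rewriting its callers.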
A formal development process is a common mechanism for planning the resources and tasks within a project. LMDT holds an SEI Level 5 certification, indicating it is a world leader in effective software development processes. Applied to recognition systems, it has still been necessary to adapt these processes for the multiple design environments and short schedules implied by the projects (and the process itself dictates this tailoring).
The most important process adaptation is to recognise the validity of prototyping within the process, and to manage prototyping as an iterative maturation of stable design components. For example, the most basic prototype is a command entered directly into the Matlab environment, or into the MSVCPP application. More complex prototypes gain maturity and become formalised into real product-worthy designs.
Figure 5. Tailored Process
Another useful perspective on the development process is that several concurrent developments exist at various stages of maturity. Immature components are expected to make heavy use of prototyping short-cuts, such as hard-coded variables; more mature components usually replace these short-cuts with more formality, such as the use of structures.
Figure 6. Formality and Roles
Within the higher-level formal process model, additional process views are appropriate for considering the roles involved and the applicability of the tasks.
For example, it is important to consider the overlapping roles of the key disciplines and the software environments applied to a continuous development process.
Figure 7. Continuous Development
Multiple environments are developed for Modelling, Test, and Integration. Each of these environments has its own library of legacy components and developing components.
Figure 8. Development Environments
The three environments are connected during the test and integration phase prior to a product release. During the integration phase, proven model components are translated into the simulation environment, where the runtime software is verified to be numerically identical to the original system model. Finally, a release is made to the external customer, controlled by the company's formal Configuration Management process (the product library).
Translation from the Modelling environment to the Simulation environment is a key challenge. Tools such as Matlab usually have auto-translation features that are not ideal (e.g. the generated code runs slowly, or does not run at all). Thus a significant effort is required to translate designs from the Modelling environment into the Simulation environment. Of course, it is expected (indeed defined) that the Simulation environment is identical to the product library in language and as many other factors as possible.
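The verification step behind this translation effort amounts to running both implementations against the same test vectors and comparing outputs element by element. A hedged sketch in Python follows; both "implementations" below are stand-ins, where in practice one side would be the Matlab model and the other the hand-translated C/C++ runtime.

```python
# Sketch of verifying that a translated runtime implementation is
# numerically identical to the original model. Both functions below
# are stand-ins for the Matlab model and the translated C/C++ code.

def model_filter(pixels):
    # Reference model: a 3-point moving average, truncated to integer
    # the same way the runtime code would truncate.
    return [(pixels[i - 1] + pixels[i] + pixels[i + 1]) // 3
            for i in range(1, len(pixels) - 1)]

def runtime_filter(pixels):
    # Translated runtime version (here deliberately identical).
    out = []
    for i in range(1, len(pixels) - 1):
        out.append((pixels[i - 1] + pixels[i] + pixels[i + 1]) // 3)
    return out

def numerically_identical(a, b):
    # Exact element-wise comparison; for floating point, a tolerance
    # test (e.g. abs(x - y) <= 1e-12) would replace strict equality.
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))

test_vector = [10, 20, 30, 40, 50]
ok = numerically_identical(model_filter(test_vector),
                           runtime_filter(test_vector))
```

Any mismatch localises the translation defect to a specific component and input, which is what makes the model usable as the detailed software specification.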
Note that selecting different tools can significantly alter this multi-environment view of the situation. For example, using a tool such as SNNS to train the Neural Networks and a Simulation environment for integration and test leaves open the question of how image data is processed into vector sets, and so on.
This scenario is what was done prior to the approach documented herein. A preliminary build of the system design had to be used to process images into vector sets, so the modelling effort was always pushed onto the Integration and Test stage of the development, at great expense. More formalisation was also needed for the support tools, because they were not very flexible yet were needed to pick up the image and data processing that was not reasonable within the integration environment. That is the realm where Perl and other scripting languages have become known for their rapid design cycles and easy adaptation. We also have a significant number of engineers with very strong Unix skills (Unix is well suited to moving large amounts of files and data).
Our experience using Matlab for system modelling has also demonstrated its superior power for the peripheral support tools: image pre-processing (per the system model), truthing, scoring, failure-mode analysis, and so on.
Configuration Management (CM) of each environment is independent with differing objectives, and only the product library is formally controlled.
The Modelling Environment is almost completely unconstrained. Of course, reference to previous configurations and proven components from those designs assists in making a buildable model. In fact, performance versus schedule is traded off during the integration and test phase of the development. A "lights-on" version is usually produced as a proof of concept. Sometimes this "lights-on" model is good enough, but more typically the system components are iterated until the time limit is reached (assuming performance requirements are exceeded, as they have been with products to date).
The CM applied to the Simulation environment is actually an informal process based on teamwork. In this environment the primary goal is to maintain a configuration that matches the Modelling Environment while the Modelling Environment remains unconstrained.
The product CM controls the Product Libraries for the target customer. This level of CM is very formal and usually does not include modelling data, or even components that are not actually included in the design. Usually design updates replace the complete subsystem, which simplifies change management.
In addition to the real products produced and sold, these synchronised libraries are the primary result of the investment made in operating development efforts according to this process. They provide an ever-increasing ability to deliver stronger products in less time, which directly relates to development cost, schedule, risk, and even system performance (usually limited by the development schedule as well).
Several products have been developed using the approach outlined herein. These include:
1. A Character Recognition device with meaningful recognition confidence output.
This project was targeted at improving the USPS RIP2 product, but has become a design tool in itself due to its rapid training time, good runtime performance, and easy cascade capabilities.
2. A Pre-sort Label recognition device.
This device was developed for the USPS APPS CFT. It contains a greyscale-based ROI function and a classifier.
3. A Non-Address-Attribute-System for indicia recognition.
This product is currently under development for the UK Royal Mail's Address Interpretation program. Here is a sample of a candidate system architecture built from many lower level recognition objects.
4. Deutsche Post Automatic Franking Recognition System.
This was not actually a delivered product but rather, a simulation that supported the requirements/performance analysis contained in the RFI response.
5. Image Re-sampling
Interpolation functions were evaluated and a design was selected for implementation in firmware. This design up-samples the images to a higher resolution, allowing a legacy design to be applied to a new customer with different design constraints.
6. Image filters
Several image filters have been delivered and implemented in firmware. Applications of these filters have been used to target specular reflections, and also to target smearing artefacts on barcodes.
7. Region Of Interest techniques
Several ROI techniques have been developed for specialised applications including the Pre-sort Labels, Address Labels, and also Text-Characters (AKA Segmentation). See figure 10.
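Of the results listed above, the image re-sampling item (5) is the simplest to illustrate. The sketch below shows 2x up-sampling of one scan line by linear interpolation, in Python; it is only an illustration of the idea, not the firmware design's actual interpolation kernel.

```python
# Sketch of 2x up-sampling by linear interpolation, illustrating the
# image re-sampling idea in item 5 above. The actual firmware design
# and its interpolation kernel are not reproduced here.

def upsample_2x(line):
    # Insert the midpoint between each pair of neighbouring pixels,
    # doubling the effective resolution of the scan line.
    out = []
    for a, b in zip(line, line[1:]):
        out.append(a)
        out.append((a + b) / 2)
    out.append(line[-1])
    return out

scan_line = [0, 10, 20]
upsampled = upsample_2x(scan_line)   # [0, 5.0, 10, 15.0, 20]
```

Up-sampling in this spirit is what allows a legacy design, tuned for one input resolution, to be reused against a new customer's imaging constraints.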
The core concept of this paper is simple: using the right tools for the tasks at hand simplifies the total development process.
Figure 9. Right Tools For The Job
Each development effort that employed these techniques benefited from the previous system-model development. For example, feature-extraction techniques developed for one application become part of the libraries: the Modelling Environment library contains a changeable version of the technique, while the Simulation environment contains the most recently translated objects. Once the system modelling is applied to the task at hand, a new model is translated.
The end result of this process is very strategic in that the Core Technology that the business is built upon becomes better understood, better applied, and more flexible.
Figure 10. Improved Core Technology
Another demonstration of the strategic nature of this process has been the generation of patentable ideas. So far several patent disclosures have been made, and the first patent application has just been filed.
Figure 11. Intellectual Property
Improvements to this process come with maturity of application. For example, the process is currently being applied to indicia recognition for the UKAI program. For the release happening this week (while this paper was being written), features have been selected based on prior designs, enabling a rapid "lights-on" delivery. After this version is integrated with the main system and debugged, improvements will be provided until the final planned release. This first version is built mostly by re-using components of previous designs; the changes to those designs have been made with an extreme sensitivity to the translation-time development cycle. Subsequent releases will be less sensitive to new translations and more interested in maximising system performance. These are the correct approaches given the complex nature of the problem (requiring training sets, etc.).
The first versions delivered will be used to better characterise the problem at hand. Meanwhile, better training sets are being prepared, as are additional stages to the design (e.g. greyscale stamp recognition). Thus the Modelling Environment, Simulation Environment, and Product Environment libraries continue to be enhanced.
I expect this approach will catch on more and more as the Technology driving the whole industry advances. Right now it's hard to imagine developing a recognition system the old way.
Keywords: SOS, SOSE, System Of Systems, Family Of Systems, consulting, information, technology, service, network, computer, systems, engineer, management, novel, patents, research.