Feedback: Software Development Paradigm Trap

James S. Gibbons - 5/15/2006

Hello Mr. Bereit,

I just finished reading your article and enjoyed it very much. I tend to agree with some of it but feel that there are basic differences between SW and HW engineering that make it hard to compare the two:

1) There are many failures in hardware engineering. Look at the first passenger jet airplane the British designed: it fell apart in flight because the safety margins were too thin. Also look at the recalls on Toyota vehicles in the past few years; steering components are falling apart. I don't see where hardware engineering is any better at solving these problems than software.

2) In hardware design the engineer does the design work and others draw up the prints and actually build it. The CAD worker and the assembly-line worker both provide checks on the final product. Where management doesn't care about quality and the workers aren't allowed any say in production is where we see design failures multiply. Software design is often done without QA workers checking the final assembly, and in many industries the programmer does the testing because the cost of a QA team would be too great or management doesn't see a use for it. Just look at how Programmable Logic Controllers (industrial computers) are programmed on-site at the job and then modified by plant electricians to tweak the program until it runs bug-free, rather than being properly engineered in the first place. I bet that most PLC programs are developed on-site during the mechanical installation of the equipment they control.

3) Throwing multiple processors at a task is not really any different from using a multi-tasking OS or a multiple-core processor, provided the OS gives the speed and accuracy needed to get the job done. The software can be divided into modules that are quite easy to maintain in existing systems, provided it is done right. The only reason I see for breaking a task up onto multiple processors is when the memory or I/O bus doesn't provide the performance needed with a single processor.

4) Software failures are many times out of our control:

While I have had bugs creep into my projects from time to time that are my own doing, the worst bug I encountered was related to the Intel Pentium 4 and was totally out of my control. It shut down our ability to ship working product and created a major problem requiring upgrades to many shipped systems. When we made the switch from the P3 to the P4, we started having issues with frame grabber board input operations freezing up. Our supplier, Matrox, offered no help with the problem. It seems the P4 platform will sometimes miss an interrupt, and this caused DMA transfers to hang and fail. I had to switch to another grab board supplier whose DMA engine ran in hardware and didn't rely on software interrupts to keep it running.

Thanks for the article as it has made me think about some ways to improve what I am doing. I already try to use HW design concepts such as state machines in an attempt to make my software more predictable in operation. While there is much work being done by companies like Rational/IBM in software ALM, they seldom address the real-time embedded market as IT is usually only interested in databases and the Web.
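
To show the sort of thing I mean, here is a minimal sketch of an explicit state machine in C++ (the states and events are invented for illustration, not taken from any real product). Every legal transition is written down in one place, so the control flow can be reviewed much the way you would review a hardware state diagram:

    // Minimal explicit state machine (hypothetical states and events).
    // Every legal transition is listed in one function, so the control
    // flow can be inspected like a hardware state diagram.
    #include <cstdio>

    enum class State { Idle, Acquiring, Processing, Fault };
    enum class Event { Start, FrameReady, Done, Error, Reset };

    State step(State current, Event ev)
    {
        switch (current) {
        case State::Idle:
            if (ev == Event::Start)      return State::Acquiring;
            break;
        case State::Acquiring:
            if (ev == Event::FrameReady) return State::Processing;
            if (ev == Event::Error)      return State::Fault;
            break;
        case State::Processing:
            if (ev == Event::Done)       return State::Acquiring;  // next frame
            if (ev == Event::Error)      return State::Fault;
            break;
        case State::Fault:
            if (ev == Event::Reset)      return State::Idle;
            break;
        }
        return current;  // any event not listed above is explicitly ignored
    }

    int main()
    {
        State s = State::Idle;
        const Event script[] = { Event::Start, Event::FrameReady,
                                 Event::Done, Event::Error, Event::Reset };
        for (Event ev : script) {
            s = step(s, ev);
            std::printf("state = %d\n", static_cast<int>(s));
        }
        return 0;
    }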

--James S. Gibbons


Mark Bereit - 5/17/2006

James,

Thanks for your comments.

I don't dispute that my analogies only go so far, and I agree that hardware systems can be done as badly as software systems! My article was intended to challenge the Frederick Brooks premise that software can't be done well and, more generally, to prompt people to think differently about development in hopes of sparking some useful models.

But I do think that splitting across processors is fundamentally different than splitting across processes. When I work on a multi-threaded or multi-process system my background design goal is always to use as few threads as I can, because threads introduce overhead, complexity and risk. The whole point of mutexes and critical sections is to try to selectively preclude asynchronous activities that can jeopardize your top-down code. I can't think of anything remotely analogous to this in any other discipline. It says we are smart enough to be afraid of what a properly running system might do to us! But when I partition off functionality to another processor, I'm never trying to have one processor halt, suspend, lock out or otherwise micro-manage the behavior of another. My PC code never attempts to suspend the MCU in the keyboard, or the one in the mouse, or the one in the hard drive, etc. The processors simply communicate pertinent information among themselves. Yes, we could do just that in software... but we don't.
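
A rough sketch of what I mean, in C++ with invented names (nothing here is from the article): one thread owns its state outright and other threads can only send it messages, much the way my PC code only sends commands to the keyboard's MCU. The only lock guards the mailbox itself, never the owner's data:

    // Sketch: one "processor" (here a thread) owns its state outright;
    // others can only send it messages, never lock or suspend it.
    // All names are hypothetical.
    #include <condition_variable>
    #include <cstdio>
    #include <mutex>
    #include <queue>
    #include <string>
    #include <thread>

    struct Mailbox {
        std::queue<std::string> q;
        std::mutex m;                  // guards only the mailbox, not the owner's state
        std::condition_variable cv;

        void send(std::string msg) {
            { std::lock_guard<std::mutex> lock(m); q.push(std::move(msg)); }
            cv.notify_one();
        }
        std::string receive() {
            std::unique_lock<std::mutex> lock(m);
            cv.wait(lock, [this] { return !q.empty(); });
            std::string msg = std::move(q.front());
            q.pop();
            return msg;
        }
    };

    int main() {
        Mailbox box;

        // The owner thread: its working state is local, so nothing outside
        // can corrupt it and nothing needs to lock it.
        std::thread owner([&box] {
            int handled = 0;           // private state, never shared
            for (;;) {
                std::string msg = box.receive();
                if (msg == "quit") break;
                ++handled;
                std::printf("handled '%s' (%d so far)\n", msg.c_str(), handled);
            }
        });

        box.send("frame-ready");
        box.send("frame-ready");
        box.send("quit");
        owner.join();
        return 0;
    }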

This past week I've been working on some desktop Windows code using a number of COM objects hooking up various interfaces with each other, and trying to debug the result. It's a mess. When I ask myself the difference between these COM objects living on the same processor and comparable objects living on different processors, the most obvious difference is that, if they were physically separated, I could put a logic analyzer on the link and see how their conversations go. But no, it's software, where I'm trying to figure it out with OutputDebugString calls on those interfaces my code receives or, worse yet, setting breakpoints. I certainly can't step through the behavior; there are too many threads involved, most of them not mine. So I have much of the overhead and quirky interconnect requirements of hardware, but without the advantages of crash protection or monitoring ability.

I truly believe that more time spent thinking about systems that can't share memory and pointers would do us all a world of good, even if we ultimately came back to a single core for later implementations.

But that's my opinion. If my article gave you any new thoughts to consider, I'm glad. And if you have more ideas to throw my way, please do. The more people thinking about this, the better our chances of improvement.

Thank you for writing!

Mark Bereit


James S. Gibbons - 6/15/2006

Hello Mark,

> This past week I've been working on some desktop Windows code using a number of COM objects

Oh, you poor fool! :)

I absolutely hate COM! It was poorly specified, and some of the documentation was more misleading than useful. There are many COM objects that have bugs when used in multi-threaded mode just because someone followed the official specifications, which are incorrect. To save some time on my latest project, I used Fanuc's COM-based SDK interface for their robots. With a lot of work I got the basic part I needed working properly, but other parts of it will clearly crash when TCP errors occur (due to broken connections or power failures). The other thing I don't like about COM/ActiveX objects is that they send messages using the Windows message queue, and this doesn't work very well for threads (or at least I haven't figured out how to make it work yet).

The only other thing I hate worse than COM is vendor libraries supplied as C++ interfaces where you need to use a specific compiler to work with the library. That is one reason I am not using the GigE camera interface provided by www.pleora.com yet. I much prefer standard C interfaces, such as that provided by the Matrox MIL library. Matrox also provides an ActiveX and .NET interface, but at least they base these on the lower level C interface which is exposed. I wish more vendors followed this route.
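
The difference is easy to show. A vendor can implement the library in whatever C++ they like and still export a flat C interface that any compiler, or any language with a C FFI, can call; a C++ class interface ties you to one compiler's ABI. Something like this hypothetical header (these are not Matrox's or Pleora's actual function names):

    /* camera_api.h -- hypothetical vendor header exposing a flat C interface.
     * Only C types cross the boundary, so the caller's choice of compiler
     * (or language) doesn't have to match the vendor's. */
    #ifdef __cplusplus
    extern "C" {
    #endif

    typedef struct CameraHandle CameraHandle;   /* opaque to the caller */

    CameraHandle *cam_open(int device_index);
    int           cam_grab(CameraHandle *cam, unsigned char *buffer, int buffer_size);
    void          cam_close(CameraHandle *cam);

    #ifdef __cplusplus
    }
    #endif

Internally CameraHandle can be a C++ class full of templates and exceptions; none of that leaks into the header, which is also why ActiveX or .NET layers can be built cleanly on top of an interface like this.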

After reading all the comments to your article, I tend to agree more with the idea that we need software building blocks and a good multiple-processor architecture. But I also see potential problems with trying to implement this.

The main problem with trying to invent useful software building blocks is that the number one software company is more interested in pushing out new features than in providing the stable environment these building blocks would need to be created in. See these links for the details of why modular software will always be a moving target on Windows:

http://www.joelonsoftware.com/articles/APIWar.html

http://www.joelonsoftware.com/articles/fog0000000339.html

While it may be possible to produce software modules within a company using a single language, doing this across languages will be difficult. It was tried with COM and now we must all switch to .NET because the COM/ActiveX components are largely being mothballed. Having played around with chips in my college days, I really liked the comments about 74XX ICs and how they standardized hardware design, but I don't see this happening in software. The field is too dynamic and fast moving.

In case you haven't noticed, MS is also at war with the embedded community and is trying to take over all of the OS market share there too. I don't think they will have total success with this, simply because their bloat-ware won't fit or work in many of these applications.

The main problem with multiple-processor architectures, as I see it, is that there is no good way to pass large amounts of data between them quickly. When I first started the design of our veneer grading vision system, I looked very carefully at multiple processors and high-speed message-passing hardware from Myricom, as Gigabit Ethernet was not yet available. The fastest processor at that time was the Pentium 233, and doing the job with a single processor was impossible. Using DSPs also didn't look like much fun, because I would have needed to write all the software myself and could not leverage libraries like Matrox MIL. Fortunately, Intel came out with a dual Pentium 2 platform running at 300 MHz by the time I had the software done, and this turned out to be fast enough to do the whole job. Another key was MMX, which sped up the image processing.

About the time the P4 came out, it sped things up again and we decided to go with color images, which further improved our grading performance. Using a message-passing system to process the color camera data would be difficult. The image data flow exceeds 100 Mbit/s Ethernet bandwidth and even overloads the Windows network stack at Gigabit speeds. This is the reason GigE camera interface solutions, such as Pleora's, use a custom stack to bypass Windows.
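
The arithmetic is easy to run for yourself. With made-up but plausible camera numbers (not our actual line rates), even a modest color camera swamps Fast Ethernet and takes a healthy bite out of Gigabit:

    // Back-of-envelope camera bandwidth check (camera numbers are
    // hypothetical, chosen only to illustrate the scale).
    #include <cstdio>

    int main() {
        const double width           = 640;   // pixels
        const double height          = 480;
        const double bytes_per_pixel = 3;     // 24-bit color
        const double frames_per_sec  = 30;

        double bytes_per_sec = width * height * bytes_per_pixel * frames_per_sec;
        double mbits_per_sec = bytes_per_sec * 8.0 / 1e6;

        std::printf("raw video rate: %.0f Mbit/s\n", mbits_per_sec);  // ~221 Mbit/s
        std::printf("Fast Ethernet: 100 Mbit/s, Gigabit: 1000 Mbit/s\n");
        return 0;
    }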

What we really need to make multi-processor systems work is a good high speed hardware solution for message passing. Perhaps placing the processors on a high speed bus like PCI-Express would work. They could either DMA packets to each other or use shared memory for communication.
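
For the shared-memory flavor, I picture something like a single-producer, single-consumer ring buffer sitting in the region both sides can see. Each side writes only its own index, so neither ever has to lock the other out. Here is a rough sketch with made-up sizes; on a real link you would still need a doorbell interrupt or fence to make the index updates visible to the other side, but the shape of the data structure stays the same:

    // Sketch of a single-producer/single-consumer ring buffer that could sit
    // in a memory region both processors can reach. Sizes and the Packet
    // layout are made up for illustration.
    #include <atomic>
    #include <cstddef>
    #include <cstdint>

    struct Packet {
        uint32_t length;
        uint8_t  payload[1500];
    };

    struct Ring {
        static const size_t N = 64;             // slot count (power of two)
        Packet slots[N];
        std::atomic<size_t> head{0};            // written only by the producer
        std::atomic<size_t> tail{0};            // written only by the consumer

        bool push(const Packet &p) {            // called on the sending side
            size_t h = head.load(std::memory_order_relaxed);
            if (h - tail.load(std::memory_order_acquire) == N) return false;  // full
            slots[h % N] = p;
            head.store(h + 1, std::memory_order_release);
            return true;
        }
        bool pop(Packet &out) {                 // called on the receiving side
            size_t t = tail.load(std::memory_order_relaxed);
            if (head.load(std::memory_order_acquire) == t) return false;      // empty
            out = slots[t % N];
            tail.store(t + 1, std::memory_order_release);
            return true;
        }
    };

    int main() {
        static Ring ring;                       // stands in for the shared region
        Packet p = {};
        p.length = 3;
        ring.push(p);
        Packet out;
        if (ring.pop(out)) { /* out now holds what the other side sent */ }
        return 0;
    }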

Thanks again for getting me to think about this. I am currently looking for a solution for an image processing system where the image data is input and tracked over several feet with high accuracy, processed during this tracking, and outputs are operated with sub-millisecond accuracy at the output point. We currently do this with a system developed around 1995, using assembly language on three processors on the STD bus, with shared memory and RS-232 communication. It sure would be nice to upgrade this to PC/104, but I am still looking for the ideal architecture.

--James S. Gibbons


Mark Bereit - 6/16/2006

James,

Thank you for your further thoughts!

"You poor fool" strikes me as an entirely reasonable response to my working with COM. I try not to do this any more than I have to. Of course, avoidance doesn't help my level of understanding. COM is an interesting topic for discussion, though, because it seeks to leverage C's pointer anarchy and C++'s virtual function implementation with a way across executable boundaries and a way of tracking object lifespan, neither of which C++ addresses. But at least in the C/C++ world, COM is far easier to do wrong than to do right.

As to Microsoft's dominant role in the software development status quo, I agree that there is inertia against any change in the world of general-purpose application-engine PCs (and this inertia also comes from Apple and Unix/Linux, let's not forget). But I think that the embedded development world is the place for discussing change: this area is more able to set aside past convention in favor of what works better. And if we have real success in this space, the principles will be adopted by the big players.

Mark Bereit