There are two things to consider when recommending books on working with legacy code. The first is which types of problem lend themselves to a solution using a best practice. Unfortunately, not many. The second is isomorphism.
Only simple and complicated problems can be solved with best practices. What simple and complicated problems have in common is that they can be solved with existing knowledge. In that sense, fixing a flat tire and performing heart surgery are alike: the knowledge needed for both already exists. The former is simple; the latter is complicated, because it requires expertise. The books I talk about here only work because they are attacking a simple problem: changing the structure of existing code.
A complex problem can only be solved with hindsight: existing knowledge cannot help you. The design of software is a complex problem, requiring hindsight, and so hindsight should be built into your method. (That is, plan to throw many solutions away; you will anyway.) These books will not help you with your toughest problems.
Isomorphism is a posh word that Hofstadter talks a lot about in Gödel, Escher, Bach. Isomorphism is to do with the preservation of information. So, an image of the sun scratched onto a cave wall by our ancestors has an isomorphic relationship with the Japanese flag. At the heart of refactoring is isomorphism: we have to preserve the information we have, the behaviour of the code, whilst improving its structure or appearance.
Refactoring, Fowler
In Refactoring, Fowler teaches us how to take an arbitrary collection of symbols (what we call code) and change their organisation whilst preserving their meaning. There is nothing more to software design. Our code needs to be as succinct, and meaningful, as it can be whilst remaining fit for purpose. Refactoring teaches us how to do this.
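To make "changing the organisation whilst preserving the meaning" concrete, here is a minimal sketch in Python. The functions, the item fields and the discount rule are all made up for illustration and are not taken from Fowler's book; the point is only that the before and after versions return the same result for the same input.

```python
# Hypothetical example: an order total computed two ways.
# All names and the discount rule are illustrative assumptions.

def total_before(items):
    # Original structure: everything in one loop, terse names.
    t = 0
    for i in items:
        if i["qty"] > 0:
            t = t + i["price"] * i["qty"]
    if t > 100:
        t = t - t * 0.1
    return t


def line_total(item):
    # Extracted function: the cost of a single order line.
    return item["price"] * item["qty"]


def apply_bulk_discount(subtotal, threshold=100, rate=0.1):
    # Extracted function: the same discount rule, now with a name.
    if subtotal > threshold:
        return subtotal - subtotal * rate
    return subtotal


def total_after(items):
    # Refactored structure: same behaviour, clearer organisation.
    subtotal = sum(line_total(i) for i in items if i["qty"] > 0)
    return apply_bulk_discount(subtotal)


if __name__ == "__main__":
    order = [{"price": 60, "qty": 2}, {"price": 5, "qty": 0}]
    # The refactoring is only valid if the two versions agree.
    assert total_before(order) == total_after(order)
```

The structure changed (two extracted functions and better names) while the information, the behaviour, was preserved. That is the isomorphism.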
Working Effectively With Legacy Code, Feathers
Michael Feathers says, in this book, that ‘legacy code is simply code without tests’. Moments later he elaborates, telling us that,
Code without tests is bad code. It doesn’t matter how well written it is; it doesn’t matter how pretty or object-oriented or well-encapsulated it is. With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don’t know if our code is getting better or worse.
Here Feathers highlights the relationship between tests and design: the former enables the latter. My definition of legacy incorporates Feathers’: code that is hard to change is legacy. Code with decent tests is easier to change than code without them. But code that is decomposed is also easier to change, as is code that relies on macros, as is code that is written in an object-oriented language. So, even though I do not fully agree with Feathers, I would recommend this book, as its contents are second to none.
Legacy Code is about how to get some tests written so we can verify that the changes we might make preserve the meaning of the code. Legacy Code could well have been subtitled Applied Refactoring.
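One way to get those first tests written is what Feathers calls a characterization test: a test that records what the code does today, rather than what it ought to do. Below is a rough sketch in Python. The function `legacy_shipping_cost` is a hypothetical stand-in for untested legacy code, and the expected values are simply what that function currently returns.

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    # Hypothetical stand-in for a piece of untested legacy code.
    c = 5
    if weight_kg > 10:
        c = c + (weight_kg - 10) * 2
    if express:
        c = c * 2
    return c


class CharacterizeShippingCost(unittest.TestCase):
    def test_pins_current_behaviour(self):
        # Values observed by running the code, not taken from a spec.
        observed = {
            (1, False): 5,
            (10, False): 5,
            (12, False): 9,
            (12, True): 18,
        }
        for (weight, express), expected in observed.items():
            with self.subTest(weight=weight, express=express):
                self.assertEqual(
                    legacy_shipping_cost(weight, express), expected
                )


if __name__ == "__main__":
    unittest.main()
```

With a test like this in place, any change to the structure of `legacy_shipping_cost` can be checked against the behaviour we started with.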
Refactoring to Patterns, Kerievsky
This book is the link between design patterns, which are well-known solutions to programming problems, and refactoring. Jorrit and I always say that a solution, to either a code problem or an organisational one, is easier to evolve to when some intermediate steps are known. This book provides these intermediate steps.
Code Complete, McConnell
This book is probably the most important book in the list. It offers best-practice solutions to simple problems, such as how to defend your code against invalid inputs, how to design loops and how to integrate code. This may seem rudimentary, but many developers either don’t know or have forgotten these basics.
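As a small illustration of defending code against invalid inputs, here is a sketch in Python. It is my own example rather than one from McConnell's book; the function name and the 1 to 5 rating range are assumptions made up for the purpose.

```python
def average_rating(ratings):
    """Return the mean of a list of ratings, each an integer from 1 to 5.

    Hypothetical example: the name and the valid range are assumptions.
    """
    # Reject an empty list up front instead of dividing by zero later.
    if not ratings:
        raise ValueError("ratings must not be empty")
    for r in ratings:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(r, bool) or not isinstance(r, int):
            raise TypeError(f"rating must be an integer, got {r!r}")
        if not 1 <= r <= 5:
            raise ValueError(f"rating out of range 1..5: {r}")
    return sum(ratings) / len(ratings)


if __name__ == "__main__":
    print(average_rating([4, 5, 3]))
```

The checks sit at the boundary of the function, so bad data fails loudly at the point of entry rather than producing a wrong answer further down the line.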