Monday, February 7, 2011

Non-DI code == spaghetti code?

A previous post describing how Dependency Injection differs from using a Dependency Injection container (largely in the context of static languages) contained the following table:

No DI ContainerDI Container
No DISpaghetti CodeService Locator Anti-Pattern
DIManual Dependency InjectionEnhanced Dependency Injection

Some took exception with the characterization of non-DI code as spaghetti code. My intent was not to be derogatory but was to succinctly describe the increased complexity of the control flow and the increased coupling of dependencies that occurs without dependency injection (due to the mixing of creation and logic concerns).  This post examines the use of this label and also contains an invitation to suggest an alternate metaphor for describing the structure of non-DI code in the context of an explanation of dependency injection.

Wikipedia describes spaghetti code as "a pejorative term for source code that has a complex and tangled control structure."  I hadn't intended to be uncomplimentary, but I do assert that the mixing of concerns caused by the lack of DI does indeed create a more complex and tangled flow.  Without dependency injection:

- creation concerns are mixed with the control flow (that is, with the application logic)
- object lifetime concerns are mixed with the control flow

Because creation instructions unrelated to the application logic are mixed in with the control flow, the control flow is (in my opinion) more complex and tangled without DI than with DI.  However, 'more complex and tangled than X' doesn't necessarily mean 'complex and tangled'.  So at what point does the mixing of concerns or additional complexity move from acceptable to spaghetti code?  Is the increased complexity in flow without DI enough by itself to merit term spaghetti code?  How many responsibilities can a unit of code have before turning into spaghetti?   How tight can couplings become before they are too tight?  I don't think that these questions can be answered outside of an individual's judgement within a particular context.  For my context in the original post of a high level comparison of the structural differences between DI code and non-DI code, I found the metaphor illustrative.

I also consider it more likely that without DI there will be increased flow complexity and dependency complexity due to reliance on global state (such as GoF-style singletons) and service locators.  Without a context-independent dependency management strategy like DI provides, I feel that these problematic dependency management strategies are more likely to be used (and hence mixed in with the logic).  This is of course not a given, however.

Beyond control flow concerns, I find it illustrative to use a spaghetti code metaphor in relation to object graphs.  When the dominant paradigm was procedural, it made sense to define spaghetti code strictly in terms of control structure.  But do people really write (for example) many goto statements these days?  Now that object-oriented programming is the dominant paradigm, high coupling between components seems to be a more common problem than, say, unstructured branching.  Of course, complexity in OO control flow could be considered a side-effect of high-coupling between objects since the objects in a system express the control structure.  Use of a spaghetti code metaphor is entirely appropriate in my opinion for describing high, unnecessary coupling among objects.

Is it possible to write good code without DI?  Of course.  People have been doing that for a long time and will continue to do so.  Might it be worth making a design decision to accept the increased complexity of not using DI in order to maximize a different design consideration?  Absolutely.  Design is all about tradeoffs.  But the point is that this should be a conscious decision informed by an understanding of what is being lost and what is being gained.

It is of course possible to write spaghetti code with DI too.  I did not mention this in the original post because it was about contrasting properly-applied DI with other alternatives and was not about describing poorly-applied DI.  My impression is that improperly-applied DI leads to worse spaghetti code than non-DI code.  It's essential to understand guidelines for injection in order to avoid creating additional dependencies.  Misapplied DI seems to involve more problems than not using DI at all.

I suspect that much of the antipathy for dependency injection stems from experience with poorly-applied DI.  Someone might start on a project where DI has been misapplied or where there has been 'container abuse' and then conclude that DI is nothing but hype and that its practitioners are a cargo-cult.  It would be understandable in that situation to not distinguish between the baby and the bath water, but it would be a mistake to ignore a valuable tool because of the result of its misapplication.  If someone attempts to use a lawnmower to paint a house, should they then conclude that lawnmowers have no value after viewing the result?

In any case, my purpose in using the term 'spaghetti code' was to illustrate the structural differences between a non-DI approach and a DI approach for those who may not yet understand the difference.  I think the metaphor does work in this context, but the term is indeed pejorative.  If you have a metaphor that illuminates this difference without being uncomplimentary, please leave a comment.  I'm interested in suggestions.

18 comments:

  1. Food for your brain

    - http://gbracha.blogspot.com/2007/06/constructors-considered-harmful.html
    - http://gbracha.blogspot.com/2007/12/some-months-ago-i-wrote-couple-of-posts.html

    ReplyDelete
  2. I don't need to be convinced about this - but I think it would be much better argument if you analyzed some examples of code with and without DI.

    ReplyDelete
  3. @zby: I didn't want the discussion to revolve around DI in a specific language (instead of around the concept in general), so I've avoided using code samples. This approach does have its drawbacks, as you note. Examples would certainly be valuable in helping to illustrate the big picture, so I'm going to have to consider how I can perhaps strike a better balance in future posts. Thanks for the feedback!

    ReplyDelete
  4. @Anonymous - Thanks for the links. They were indeed interesting. And I got a good laugh upon seeing the title 'Lethal Injection'.

    ReplyDelete
  5. Here is an example of typical business app code which does not use DI:

    // Sends a notification e-mail by using the Apache Commons Email API.
    public class MyBusinessService {
    public void doSomeOperation(SomeData data) {
    // do some other stuff
    Email email = new SimpleEmail();
    email.setSubject("... a subject...");
    email.addTo(data.getCustomerEmail());
    email.setMsg("...the message...");
    email.send();
    }
    }

    And here is a unit test, which exercises the method in isolation from the Email API (using the JMockit mocking API):

    public class MyBusinessServiceTest {
    @Test
    public void doSomeOperationShouldSendEmail(final SimpleEmail mock) {
    SomeData data = ...create suitable data...
    new MyBusinessService().doSomeOperation(data);

    new Verifications() {{ mock.send(); }};
    }
    }

    I really don't see how this code could possibly benefit from DI.

    ReplyDelete
  6. @Rogerio - We have a difference of opinion because I see that design as being a great candidate for DI. But really it has less to do with DI than with just general design principles, but an improved design would use DI in my opinion.

    Let's take a look at what I would consider to be the design issues. First, the SimpleEmail class is tightly coupled to an email-sending implementation in its send() method. Because of this, the design does not adhere to the Open/Closed Principle. In order to use a different email implementation, either the MyBusinessService class or the SimpleEmail class is going to have to be modified (hence violating the 'open for extension, closed for modification' guideline).

    Also, SimpleEmail has more than one responsibility. It is responsible for both maintaining the state of the message and for actually sending the message. Therefore I don't consider its design as being in the spirit of the Single Responsibility Principle.

    Finally, the email-sending dependency is hidden. To me, this is really the most problematic issue. There's no way for someone to know about this dependency by looking at the definitions of either MyBusinessService or SimpleEmail. In the general case, a developer is only going to learn about the dependency through some sort of testing pain [errors] and then trial-and-error digging to find the source of the errors. "Hmmm... I'm getting a weird error about a mail server in this test". Then they'll have to dig deeper and will eventually discover that they essentially have to remove part of the class (by mocking it out) to run successfully. Personally I prefer the dependencies to be explicit. It's much easier, especially for others who may work on your project or who may maintain it later. There's inconsistency in that some of SimpleEmail's methods (setSubject, addTo, setMsg) don't need to be mocked in tests, but another one (send) does, and that distinction is not made clear by the class definition. When a subset of a concrete instance needs to be removed (i.e. mocked out), then it's a strange situation where a concrete instance is both a test double and the component being tested. To me, that's a code smell telling me that something in the design could be improved.

    Personally if I were implementing this, I would create a separate 'email sender' service, remove this responsibility from SimpleEmail, and inject this new service into MyBusinessService. SimpleEmail then can just be a simple Value Object. Everything is then explicit, and the responsibilities are separated. This is how I see DI as being useful to this design. It's a simple change that brings many benefits.

    I agree with you that *by itself* this test is perfectly manageable. I'm not sure that this would be true of all tests that use SimpleEmail though. However, if you add a few more classes with implicit and hidden dependencies like this, then when doing an integration test or even just test setup for a unit test, things can get out of hand quickly. It's a 'boiling frog' situation - where each problem by itself doesn't add enough pain or cause enough problems to warrant much notice, but many of them combined across a system can result in 'death'.

    ReplyDelete
  7. Hmm.. I could easily write a much longer text explaining why I disagree with most of what you said, but in the end we would only agree to disagree. (For example, I think that OCP is a bad, outdated principle which has no place in modern software development - consider version control, refactoring tools, TDD, agile, etc. which were not in vogue in the 1980's.)

    The Apache Commons Email API could be improved, sure, but the fact is that it *is* very easy to use - and that's what really matters in practice.

    About the supposed issue of "hidden" dependencies, I think you forget about the *information hiding* principle. There is no need for clients of MyBusinessService to even know that it sends an e-mail, much less which API it uses to do it. Unless you have an actual business/technical requirement that calls for it, such information should not be exposed to client code.

    I have actually worked with large codebases in projects lasting years, where code similar to what I have shown was the order of the day. And it worked just fine, with new releases delivered regularly. My basic rule is that simplicity and effective solutions (proven in real-world scenarios) come first. So, if principles/patterns/practices like OCP and DI don't demonstrate value in real projects, why insist in following them?

    ReplyDelete
  8. @Rogerio - Yes, it sounds like we would likely agree to disagree. Personally I've found a lot of benefits with this approach, and I find the SOLID guidelines to be helpful and very relevant today. Luckily there's never One True Way. It sounds like you've found an approach that works well for you, and I always enjoy hearing others articulate the benefits of the differing practices and approaches that they use. After all, if I never consider any alternatives, how can I ever improve?

    I am however going to disagree with you about there being a difference in information hiding between the two approaches in terms of a client's view of MyBusinessObject. With the proposed design using DI, clients of MyBusinessObject do not know any more about the internals than they do in your approach. DI will know more about MyBusinessObject at startup because part of its job is to wire up the app. But components that use MyBusinessObject as a collaborator have no knowledge of that wiring. In general, I find the DI's separation of construction logic and application logic to be very conducive to information hiding. In any case, both approaches offer clients the same view of MyBusinessObject.

    ReplyDelete
  9. Yes, use of DI does not require the dependencies of a class to be publicly exposed to clients. I thought that's what you meant with "I prefer the dependencies to be explicit".

    In the case of instantiating SimpleEmail, though, there is no "construction logic". You simply create an object suited to the task at hand, much like we instantiate an ArrayList; it's ok in cases like this because there is no need for selecting between multiple implementations based on external configuration.

    DI is supposed to be about "the principle of separating service configuration from the use of services within an application" (from http://martinfowler.com/articles/injection.html). In the case of the Email service the actual configuration is done through Java system properties ("mail.host", etc.). DI *could* be used, but it would be a more complex solution.

    ReplyDelete
  10. What I had described was not injecting anything into SimpleEmail but instead to break the email-sending functionality out of SimpleEmail into a separate mail service. This mail service would then be injected into the MyBusinessService class (instead of into SimpleEmail). SimpleEmail becomes a simple ValueObject then. I agree that injection of a mail service into a newable (SimpleEmail) would be undesirable, but injection into a service (MyBusinessService) presents no such problems.

    What I was attempting to point out regarding explicitness is that the injection of the mail service into MyBusinessService makes the dependency (on mail sending functionality) more explicit (since it would then be in the form of a required constructor parameter) compared to having the dependency managed by SimpleEmail::send (which either has to call a procedural API or has to inline instantiate some sort of mail service object - neither of which would be apparent by looking at the SimpleEmail definition).

    ReplyDelete
  11. The Commons Email API is just a wrapper around the standard JavaMail API, designed for ease of use in common scenarios. The Spring Framework provides the same, but breaking the functionality in the way you suggest, with an injected MailSender object and a SimpleMailMessage value object.
    In practice, both APIs work well, with the Commons API being more straightforward (it does take less code overall).

    It should be noted that use of DI tends to encourage non-OO designs, with a separation of code between stateless services (usually singletons) and "anemic" domain objects. The MailSender/MailMessage separation is an example of that. I find the design of the Commons Email base class, with its "send()" method, to be indeed more OO.

    ReplyDelete
  12. I don't agree that dependency injection tends to encourage non-OO designs. While an emphasis on services is a departure from traditional OO design, it is just a different style of OO design rather than a non-OO design. In fact, I would say that DI encourages what is, in my mind, a better style of OO design and modelling.

    For example, take the SimpleEmail class. SimpleEmail models a message which sends itself. But is this an accurate model? Does a letter send itself In the real word? Does an email send itself in the real world? Does an SMS message send itself? In all cases, the answer is no. All are sent by services [postal delivery, sendmail server, SMS provider]. A design that explicitly models these services is, in my opinion, not only a more accurate model of the real world but also a better design overall (improved testability, division of responsibilities, intention-revealing interfaces, etc).

    In commonly-practiced OO, there is a strong emphasis on nouns. Verbs only exist when attached to nouns. There's a discomfort with explicit modelling of verbs (that is, services). But a service or a business process is a 'thing', and modelling them as such is entirely appropriate and helpful in my opinion.

    As further evidence that incorporation of services doesn't make a design procedural, consider Domain Driven Design. While DDD has nothing to say about dependency injection, it does give services a prominent role, and it would be difficult to argue that DDD is not an OO approach.

    And I don't find that the domain objects are anemic in this style of OO because everything does exactly what it should do. Everything performs its responsibility, no more and no less. In this particular example, the message has the responsibility of maintaining its state, and the mail service has the responsibility of sending the message. While some may find the message class anemic because every verb that has a message as an object isn't part of the message definition, I personally find its level of responsibility just right.

    However, note that the behavior to send a message can still belong to SimpleEmail even with separation into services. SimpleEmail can still have a send() method, but instead of being parameterless, it would take the mail service interface as a parameter. Such a design introduces no testability problems (like there would be if the mail service was injected into SimpleEmail), and that way the behavior still resides with the object.

    ReplyDelete
  13. I think I already proved (see code example in my first comment) that there are no testability problems or even difficulties in unit testing a class which uses APIs like Commons Email (or, in general, that code which makes no use of DI is just as testable as code which does).

    You're right that a mail message sending itself does not model the real world. I don't really care about "modeling the real world", though, but about effective and pragmatic software development. What I see in practice is that developers tend to implement service classes as stateless singletons *by default* when using a DI framework, because it's the "path of least resistance". And this is certainly less OO than making those services stateful and created on demand as needed. I have used this approach extensively, with hundreds of stateful business service classes created with "new" whenever needed; it has worked very well, with several advantages such as the ability to use final fields and to have a trivial service lifecycle model; and at the same time being able to easily write unit tests.

    ReplyDelete
  14. As you indicated near the start of this thread, I think we're likely to agree to disagree on this topic. Either approach can obviously be used to build a system successfully. However, I don't see the two as equivalent in terms of testability by my definition of testability - the quality that allows a component to be easily tested in isolation. This may or may not matter depending on design goals, but I think that there is an opportunity for increased testability of the design if there was a desire to emphasize that aspect of the design.

    SimpleEmail has a tight coupling to an external service and therefore is not testable is isolation. *Most* of the class can be tested in isolation, but not *all* of it can be (that is, the send() method can not be). Therefore according to my definition [and you may have your own definition], I don't consider the class to be testable. In the example test, part of the functionality of the class has to be disabled (i.e. mocked out) in order to test it, so SimpleEmail is simultaneously both a test double and a component under test. That seems like a code smell to me from a testability perspective. In any case, my point is that the class doesn't fall into the definition of testable that I use, since the entire class can not be tested in isolation. I realize that you may use a different definition of testability (and perhaps this is the source of our differences in opinion and approach).

    You mention direct 'new'-ing of services also. I know this is another area in which we probably won't come to a consensus, but I view direct 'new'-ing of services as a testability problem, since it doesn't allow for a (language-based rather than tool-based) seam in consumer of the service for substitution during testing so that the consumer can be tested in isolation from the service.

    I already described in my previous response why I find modelling activities (i.e. services) that aren't naturally a part of an entity or a value object to be fully consistent with an OO approach, so I won't repeat those here. I do agree with you that services in DI are typically single instances and stateless, but I view that as a strength and desirable rather than as a problem. It's consistent with the definition of a service in the DDD (Domain Driven Design), and it makes sense for an activity (verb). Statelessness allows a client to use the instance without the history of usage affecting the consumer.

    ReplyDelete
  15. Actually, I think your definition of testability is very good. I don't disagree with any part of it. The only disagreement, I suspect, would be in the "specifics about what makes it simple or difficult to test"; but these specifics lie outside the definition.

    Regarding SimpleEmail, note that I never intended to discuss the testing of this particular class, but of client classes such as "MyBusinessService". SimpleEmail actually is a very small class extending the base Email class, which does all the interaction with the javax.mail API (the external service you mentioned). I can't be sure without a deeper study of the code, but Email doesn't seem to be semantically coupled to javax.mail classes; that is, it most likely is loosely coupled to the implementation behind that API. (I am referring to the modern definition of "coupling" here, as described in the "Code Complete 2" book.)

    In my example test, the SimpleEmail class and its superclass Email are completely mocked, so that only MyBusinessService, the client class, is part of the "code under test". (I could have written "@Mocked SimpleEmail mock" as the mock parameter instead of simply "SimpleEmail mock", to make this more clear, however.) So, in that particular test "SimpleEmail" was indeed a "pure" test double, nothing more.

    Mocking, in its most fundamental sense, is about substituting the *implementation* behind some interface for the duration of a given test. The idea that you need "seams" to enable the substitution is an artificial and unnecessary restriction. In Java, you could do the substitution *without* using any sophisticated tool, by simply adding a mock implementation to the classpath instead of the original/real implementation (for example, instead of commons-email.jar, have your own "SimpleEmail.class" in the classpath). Of course, doing so manually would be cumbersome (even if not difficult), so you use a tool or language feature instead; in any case, no particular seams are required.

    ReplyDelete
  16. Jeremy RosenbergJune 5, 2012 at 2:03 PM

    For anyone else coming late, what Rogerio is suggesting is the equivalent of Service Locator. Java's class loader is a built-in service locator. This is the "No DI"/"DI Container" slot at the top of this article, with the JVM providing Container-ish features.

    ReplyDelete
  17. If you were doing static spaghetti without DI, you'll just be doing dynamic spaghetti with DI.

    Avoiding spaghetti code is not a matter of DI or no DI.

    And arguably dynamic spaghetti is far worse than static spaghetti as it can't be safely refactored with compiler support, there will always be some dynamic spaghetti you'll miss.

    So no magic wand to be found there.

    ReplyDelete
  18. @Anonymous - I largely agree with you. And I certainly wasn't attempting to describe DI as a magic wand. DI is neither sufficient nor necessary for clean code.

    However, I believe that by separating dependencies from logic, DI provides a structure that helps developers who care about such things to achieve them. It's likely not going to help someone who is unconcerned with separating those concerns. I have found it helpful for those who do care or would be interested but do not know otherwise know how to achieve the separation.

    ReplyDelete