Choosing composition over inheritance: yet another example!

Julien on May 30th 2008

A friend of mine recently asked me if I thought that having an inheritance hierarchy with a depth of 10 classes was acceptable or not. I don't think it is. If you take a class at the bottom, any change to the 9 classes above can introduce a breaking change. It can quickly become a maintenance nightmare!

Most of the time, if you have that kind of hierarchy in your application, it means that you chose inheritance over composition. It's probably also a code-smell indicating that your class is doing too much. To avoid that, let me show you how it is possible to extend a class without using inheritance.

Let's assume that we start with the following code. It's very basic, but it's enough for our demonstration.

    using System;

    abstract class TripBase
    {
        private readonly string _from;
        private readonly string _to;

        public string From
        {
            get { return _from; }
        }

        public string To
        {
            get { return _to; }
        }

        protected TripBase(string from, string to)
        {
            _from = from;
            _to = to;
        }

        public abstract DateTime CalculateEstimatedArrivalTime();
    }

    class CarTrip : TripBase
    {
        public CarTrip(string from, string to)
            : base(from, to)
        {
        }

        public override DateTime CalculateEstimatedArrivalTime()
        {
            // Hard-coded travel time, to keep the example simple.
            return DateTime.Now.AddHours(15);
        }
    }

    class PlaneTrip : TripBase
    {
        public PlaneTrip(string from, string to)
            : base(from, to)
        {
        }

        public override DateTime CalculateEstimatedArrivalTime()
        {
            // Hard-coded travel time, to keep the example simple.
            return DateTime.Now.AddHours(2);
        }
    }

We have a TripBase abstract class and we want to extend it with 2 subclasses: CarTrip and PlaneTrip. In a real application, we would have some kind of algorithm to calculate or fetch the transportation time, and it would probably be very different for each class. In this example, to simplify, we just return a hard-coded value.
This code is not bad, but it's not perfect either. For instance, are we sure that calculating the transportation time between 2 cities is the responsibility of the CarTrip and PlaneTrip classes? As a matter of fact, if we wanted to add a new MotocycleTrip class that has different properties but uses the same algorithm as CarTrip to calculate the transportation time, we would need to extract a new abstract superclass that we would probably call RoadTrip... It doesn't look good...

Here is a refactored version of these classes, using composition instead of inheritance:

    using System;

    class Trip
    {
        private readonly string _from;
        private readonly string _to;
        private readonly ITransportationMode _transportationMode;

        public string From
        {
            get { return _from; }
        }

        public string To
        {
            get { return _to; }
        }

        public Trip(string from, string to, ITransportationMode transportationMode)
        {
            _from = from;
            _to = to;
            _transportationMode = transportationMode;
        }

        public DateTime CalculateEstimatedArrivalTime()
        {
            // The trip delegates the calculation to the injected transportation mode.
            return DateTime.Now.Add(_transportationMode.CalculateTransportationTimeBetween(_from, _to));
        }
    }

    internal interface ITransportationMode
    {
        TimeSpan CalculateTransportationTimeBetween(string from, string to);
    }

    class CarTransportationMode : ITransportationMode
    {
        public TimeSpan CalculateTransportationTimeBetween(string from, string to)
        {
            return new TimeSpan(15, 0, 0);
        }
    }

    class PlaneTransportationMode : ITransportationMode
    {
        public TimeSpan CalculateTransportationTimeBetween(string from, string to)
        {
            return new TimeSpan(2, 0, 0);
        }
    }

In a few words, I've extracted the calculations into their own classes, which implement an ITransportationMode interface. They are injected into the Trip class through the constructor (but we could also use setter injection or method injection).
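To make the two alternatives concrete, here is a minimal sketch of setter injection and method injection. ConfigurableTrip, its TransportationMode setter and the CalculateDuration methods are hypothetical names invented for this illustration; they are not part of the refactored code above, and returning a TimeSpan instead of an arrival time is only a simplification.

```csharp
using System;

interface ITransportationMode
{
    TimeSpan CalculateTransportationTimeBetween(string from, string to);
}

class CarTransportationMode : ITransportationMode
{
    public TimeSpan CalculateTransportationTimeBetween(string from, string to)
    {
        return new TimeSpan(15, 0, 0); // hard-coded, as in the post
    }
}

class PlaneTransportationMode : ITransportationMode
{
    public TimeSpan CalculateTransportationTimeBetween(string from, string to)
    {
        return new TimeSpan(2, 0, 0); // hard-coded, as in the post
    }
}

// Hypothetical variant of Trip, used only to show the two injection styles.
class ConfigurableTrip
{
    private readonly string _from;
    private readonly string _to;
    private ITransportationMode _transportationMode;

    public ConfigurableTrip(string from, string to)
    {
        _from = from;
        _to = to;
    }

    // Setter injection: the dependency can be replaced on an existing object.
    public ITransportationMode TransportationMode
    {
        set
        {
            if (value == null) throw new ArgumentNullException("value");
            _transportationMode = value;
        }
    }

    // Uses the mode provided through the setter.
    public TimeSpan CalculateDuration()
    {
        if (_transportationMode == null)
            throw new InvalidOperationException("No transportation mode has been set.");
        return _transportationMode.CalculateTransportationTimeBetween(_from, _to);
    }

    // Method injection: the dependency is supplied for this call only.
    public TimeSpan CalculateDuration(ITransportationMode mode)
    {
        if (mode == null) throw new ArgumentNullException("mode");
        return mode.CalculateTransportationTimeBetween(_from, _to);
    }
}

static class Program
{
    static void Main()
    {
        ConfigurableTrip trip = new ConfigurableTrip("Paris", "London");

        trip.TransportationMode = new CarTransportationMode();
        Console.WriteLine(trip.CalculateDuration().TotalHours); // prints 15

        Console.WriteLine(trip.CalculateDuration(new PlaneTransportationMode()).TotalHours); // prints 2
    }
}
```

The trade-off is that, with setter injection, the object can exist for a while without its dependency, which is why the sketch has to check for a missing mode. Constructor injection avoids that check.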

This implementation improves the code in several ways:

  • The calculation of the transportation time is done in a new class, so it's not polluting our Trip class any more. It's a lot easier to add new algorithms: we just need to add a new implementation of ITransportationMode, which can evolve independently of the Trip class. This is the separation of concerns principle.
  • We can configure our Trip class more easily. Before, we had to create a new instance of a TripBase subclass to change the way the calculation was done. Now we just provide a different ITransportationMode instance (and with setter injection, we could even do it while keeping the same Trip object).
  • We can have several subclasses of Trip using the same algorithm without introducing a new abstract class (such as RoadTrip). Therefore, composition helps us to keep our class hierarchy flat.
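As a rough illustration of the last point, the sketch below lets two Trip subclasses share one algorithm without any RoadTrip superclass. RoadTransportationMode, the extra properties on the subclasses and the EstimatedDuration property are assumptions made up for this example, and Trip is simplified compared to the version above.

```csharp
using System;

interface ITransportationMode
{
    TimeSpan CalculateTransportationTimeBetween(string from, string to);
}

// Hypothetical: a single road algorithm shared by cars and motorcycles.
class RoadTransportationMode : ITransportationMode
{
    public TimeSpan CalculateTransportationTimeBetween(string from, string to)
    {
        return new TimeSpan(15, 0, 0); // hard-coded, as in the post
    }
}

// Simplified Trip: exposes a duration instead of an arrival time.
class Trip
{
    private readonly string _from;
    private readonly string _to;
    private readonly ITransportationMode _transportationMode;

    public Trip(string from, string to, ITransportationMode transportationMode)
    {
        _from = from;
        _to = to;
        _transportationMode = transportationMode;
    }

    public TimeSpan EstimatedDuration
    {
        get { return _transportationMode.CalculateTransportationTimeBetween(_from, _to); }
    }
}

// Both subclasses add their own (hypothetical) properties but no algorithm.
class CarTrip : Trip
{
    public readonly int PassengerCount;

    public CarTrip(string from, string to, ITransportationMode mode, int passengerCount)
        : base(from, to, mode)
    {
        PassengerCount = passengerCount;
    }
}

class MotocycleTrip : Trip
{
    public readonly bool HasSidecar;

    public MotocycleTrip(string from, string to, ITransportationMode mode, bool hasSidecar)
        : base(from, to, mode)
    {
        HasSidecar = hasSidecar;
    }
}

static class Program
{
    static void Main()
    {
        ITransportationMode road = new RoadTransportationMode();

        Trip car = new CarTrip("Paris", "Berlin", road, 4);
        Trip motorcycle = new MotocycleTrip("Paris", "Berlin", road, false);

        // Same algorithm, no common RoadTrip base class needed.
        Console.WriteLine(car.EstimatedDuration == motorcycle.EstimatedDuration); // prints True
    }
}
```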


Filed in General development | 9 responses so far

The null singleton

Julien on May 24th 2008

I'll say it straight: I really don't like the Singleton pattern. I think this pattern was great a few years ago, before we started to use things such as dependency injection or dependency lookup. Unfortunately for me, as of today, it's probably the most used pattern in software development. To be fair, this pattern can be useful in several scenarios, but please... not everywhere! (For the record, a singleton is a pattern that ensures that you have only one instance of a specific class. This instance is created "on demand", the first time the class is used.)

The main problems with singletons are obvious:

  • You don't control when an instance is created, so if you have a lot of them, you have no idea in which order the instances will be created. Not an issue in small software, but more worrying when you have a few hundred thousand lines.
  • 99% of the time, the implementation that is used binds us to a specific class. Therefore you have to either refactor the singleton or use a tool like TypeMock if you want to unit test your code. (Or you can have an empty class that holds the static instance and returns it cast to an interface... But doing that results in a multiplication of almost useless classes.)
  • Dependencies are not obvious (which can also be a problem with dependency lookup). If you have a type A that uses a singleton of type B, nothing in A's signature will show you the dependency. It might be a problem if you're not aware that using A results in a hundred round trips to the database through B!

However, I recently discovered a pattern that is even worse than the Singleton: I call it the "Null Singleton"! It's a very simple concept: the singleton can fail to instantiate itself if some dependency (a config file, a database, or whatever) is not available. In that case, the instance returned by the singleton will be null. Let's see an example:

    class NullSingleton
    {
        private static NullSingleton _instance;

        public static NullSingleton Instance
        {
            get
            {
                if (_instance == null)
                {
                    if (CanAccessMyDependency())
                    {
                        _instance = new NullSingleton();
                    }
                }

                // Still null if the dependency was not available.
                return _instance;
            }
        }

        private static bool CanAccessMyDependency()
        {
            // Check if you can connect to the database, for instance.
            // Placeholder result so that the sample compiles.
            return false;
        }

        private NullSingleton()
        {
        }
    }

So what is wrong with that?

  • If the dependency is unavailable, the NullSingleton will check the dependency again each time something accesses it. It will probably be very slow (usually, it will be an out-of-process call).
  • Each time you try to get an instance, you'll have to check for a null reference, which makes the code noisy.
  • And finally, it breaks the usual expectation about singletons (that they always return one instance).
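The first two points can be observed with a small instrumented copy of the class. The DependencyChecks counter and the DoWork method are additions made for this demonstration only, and the dependency is simulated as permanently unavailable.

```csharp
using System;

class NullSingleton
{
    private static NullSingleton _instance;

    // Instrumentation added for the demonstration only.
    public static int DependencyChecks;

    public static NullSingleton Instance
    {
        get
        {
            if (_instance == null)
            {
                if (CanAccessMyDependency())
                {
                    _instance = new NullSingleton();
                }
            }

            return _instance;
        }
    }

    private static bool CanAccessMyDependency()
    {
        DependencyChecks++;
        return false; // simulate a dependency that is never available
    }

    private NullSingleton()
    {
    }

    public void DoWork()
    {
    }
}

static class Program
{
    static void Main()
    {
        for (int i = 0; i < 3; i++)
        {
            NullSingleton instance = NullSingleton.Instance;

            // Every caller has to repeat this null check.
            if (instance != null)
            {
                instance.DoWork();
            }
        }

        // One (potentially slow) dependency check per access.
        Console.WriteLine(NullSingleton.DependencyChecks); // prints 3
    }
}
```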

Do yourself a favor: don't use a "Null Singleton" :-)


Filed in General development | 2 responses so far

Spec#: should we really wait for it?

Julien on May 13th 2008

A friend of mine just posted about his desire to have Spec# ready for production. Funnily, many people in Alt.net expressed a similar desire in the past weeks (See that or that).

I don't have any problem with Spec#. On the contrary, I believe that it's a great addition to the language, one that would result in a significant improvement in code quality. However, I also don't think we should expect anything soon for various reasons:

  • Spec# is only a research project. Even if Microsoft did publish previews, I don't think they've communicated any plan to have an RTM yet.
  • Spec# is an addition to C#. However, there's now a lot more than C# in .NET, and I think that it would be problematic to add this feature to C# and not to VB.NET for instance (which is, believe it or not, more used than C#). Actually, what happens if you build an assembly with Spec# and use it from C#/VB.NET? I don't have a clue, but it would be interesting to know.
  • It will be difficult for Spec# to enter the mainstream: enterprises will be reluctant to migrate to it. Moving from .NET 2.0 to .NET 3.5 is absolutely painless, and still, do you know many companies that are actively working with .NET 3.5?
  • Spec# adds several new keywords, each of which is a new concept to understand before being able to work with it. With the recent additions to the framework, I'm starting to think that there's already too much to learn for the average developer! Just an example: how many of your colleagues know what the "??" operator means in C#? Or another one: I had to explain what a Nullable was only a few days ago :). If you add Linq, WCF, WPF, Entity Framework...

The bottom line is that I believe we should focus our efforts on improving the way we do defensive programming with the framework as it is, and not hope for a white knight. Even if they are not perfect solutions, I think we can cover a lot of ground by systematically checking the inputs, checking the postconditions/side effects with unit testing, and checking that our objects satisfy their invariants.

Filed in .NET | One response so far

Performance impact of the readonly keyword

Julien on Apr 22nd 2008

I've heard from a co-worker that the readonly keyword optimizes memory access and therefore makes field access quicker than with a normal field. I must admit that I found this assertion a bit surprising. My first bet would have been that performance is identical with or without it (it's just a compile-time check that doesn't change anything at execution). If wrong, I would then have bet on slower access (I would expect the CLR to do some checks when accessing the field). Let's see if I'm mistaken :).

Unfortunately for me, I didn't find anything to confirm or refute my theories, so I decided to do my own micro-benchmark.

I measured the time spent for 3 main scenarios, with a read only field and without it:
- Scenario 1: Creating an object
- Scenario 2: Accessing the field
- Scenario 3: Creating an object and accessing the field

I executed this code 1,000,000,000 times for each scenario and ran each scenario 10 times on 4 different PCs to get a meaningful average. For each test, I excluded the top and bottom results.
The code (in C#, .NET 2.0) is available here (it is duplicated for each test; I know it's horrible, my apologies :)), and the results here.

Each test looks like this (with slight modifications of course):

    _streamWriter.WriteLine();
    _streamWriter.WriteLine("Obj creation & access in loop");
    for (int j = 0; j < _numberOfTestExecutions; j++)
    {
        Poco poco = new Poco();

        // Time the version with a readonly field.
        Stopwatch watch = new Stopwatch();
        watch.Start();
        for (int i = 0; i < _numberOfIterationPerTest; i++)
        {
            ReadOnlyField objReadOnlyField = new ReadOnlyField(poco);
            Poco myCopyOfPoco = objReadOnlyField.Poco;
        }
        watch.Stop();
        _streamWriter.Write(watch.ElapsedMilliseconds + ";");

        // Time the version with a normal (read-write) field.
        watch.Reset();
        watch.Start();
        for (int i = 0; i < _numberOfIterationPerTest; i++)
        {
            ReadWriteField objReadWriteField = new ReadWriteField(poco);
            Poco myCopyOfPoco = objReadWriteField.Poco;
        }
        watch.Stop();
        _streamWriter.WriteLine(watch.ElapsedMilliseconds);
    }
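The classes under test are not shown in this post. Judging only from how they are used above, they presumably look something like the following sketch; the field names and the exact shape of these classes are assumptions.

```csharp
using System;

// Empty payload class, as suggested by "new Poco()" in the test.
class Poco
{
}

// Assumed shape: the field is readonly and exposed through a getter.
class ReadOnlyField
{
    private readonly Poco _poco;

    public ReadOnlyField(Poco poco)
    {
        _poco = poco;
    }

    public Poco Poco
    {
        get { return _poco; }
    }
}

// Assumed shape: identical, except that the field is not readonly.
class ReadWriteField
{
    private Poco _poco;

    public ReadWriteField(Poco poco)
    {
        _poco = poco;
    }

    public Poco Poco
    {
        get { return _poco; }
    }
}

static class Program
{
    static void Main()
    {
        Poco poco = new Poco();

        // Both variants behave the same way from the outside.
        Console.WriteLine(ReferenceEquals(new ReadOnlyField(poco).Poco, poco));  // prints True
        Console.WriteLine(ReferenceEquals(new ReadWriteField(poco).Poco, poco)); // prints True
    }
}
```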

Here are the raw numbers:

Scenario 1: Creating an object
- read only: 25172ms
- read write: 24983ms
Read only is on average 0.76% slower than read write.

Scenario 2: Accessing the field
- read only: 6437ms
- read write: 6449ms
Read only is on average 0.19% quicker than read write.

Scenario 3: Creating an object and accessing the field
- read only: 19929ms
- read write: 19761ms
Read only is on average 0.85% slower than read write.
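For anyone who wants to check the arithmetic, the percentages above are consistent with a difference taken relative to the read-write time. This is my reading of the numbers rather than something stated with the results, and the helper name below is made up for the example.

```csharp
using System;
using System.Globalization;

static class Program
{
    // Positive result: read only is slower. Negative result: read only is quicker.
    static double RelativeDifferencePercent(double readOnlyMs, double readWriteMs)
    {
        return (readOnlyMs - readWriteMs) / readWriteMs * 100.0;
    }

    static string Format(double value)
    {
        return value.ToString("F2", CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        Console.WriteLine(Format(RelativeDifferencePercent(25172, 24983))); // prints 0.76
        Console.WriteLine(Format(RelativeDifferencePercent(6437, 6449)));   // prints -0.19
        Console.WriteLine(Format(RelativeDifferencePercent(19929, 19761))); // prints 0.85
    }
}
```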

However, you can see greater differences by looking at the results for each computer. For instance, on the first PC I used (a Core 2 Duo, 2.8 GHz, DDR2 800 MHz), I have the following results:

- Scenario 1: Creating an object
Read only is on average 5.72% quicker than read write.
- Scenario 2: Accessing the field
Read only is on average 0.94% quicker than read write.
- Scenario 3: Creating an object and accessing the field
Read only is on average 2.28% quicker than read write.

And on a laptop (Centrino 1.7 GHz, DDR):
- Scenario 1: Creating an object
Read only is on average 6.34% slower than read write.
- Scenario 2: Accessing the field
Read only is on average 0.21% quicker than read write.
- Scenario 3: Creating an object and accessing the field
Read only is on average 3.01% slower than read write.

My (own) conclusions:
There seems to be a performance difference when using the readonly keyword. The problem is that the impact is highly dependent on the hardware. On a PC with decent hardware, accessing a read only field is between 1% and 2.5% quicker than accessing a normal field. Object creation can be as much as 6% quicker. On lower-end hardware, the results are the opposite.
However, the tests were made with a very simple class, so keep in mind that in reality, the speed difference in object creation would be smaller with classes that have more members.

As far as I'm concerned, I would not use the readonly keyword just for optimization purposes except in very demanding situations, and only after checking that it's actually quicker on the specific machines where the software is going to run. So unless your application is:
- running in a very controlled environment with a very limited set of installations
- pseudo real-time or extremely sensitive to performance
I don't think you should even wonder about the performance impact of readonly. There are probably hundreds of optimizations that will be more effective before this one!

Finally, if you have any evidence showing a different behaviour, please tell me about it and I will edit this post accordingly :). I'm especially interested in benchmarks with Xeon processors (unfortunately, I don't have access to one).

Filed in .NET | 6 responses so far

Keep your objects in a consistent state

Julien on Apr 12th 2008

Part of a good object oriented design is to keep the objects in a correct state.

For instance, in the financial world, we have financial instruments called "derivatives". According to Wikipedia:

Derivatives are financial instruments whose value changes in response to the changes in underlying variables. The main types of derivatives are futures, forwards, options, and swaps.

In the real world, a derivative can't exist if there isn't an underlying financial instrument associated with it. A future on the Dow Jones could not have been created without the Dow Jones in the first place. If we want to represent a derivative in software, we must make sure that we follow the same rule.

So let's assume that I have the following class:

    class Derivative
    {
        private string _name;
        private Instrument _underlyingInstrument;

        public string Name
        {
            get { return _name; }
        }

        public Instrument UnderlyingInstrument
        {
            get { return _underlyingInstrument; }
        }

        public Derivative(string name, Instrument underlyingInstrument)
        {
            _name = name;
            _underlyingInstrument = underlyingInstrument;
        }
    }

Any new developer on a project that uses this Derivative class will mentally map the Derivative class to the corresponding financial concept. He will expect the UnderlyingInstrument property to return a non-null object. However, this is not guaranteed in the current implementation. As a matter of fact, this class can currently be used to convey only the name of a Derivative. If we wanted to do that, we would need to create a new class for that purpose only. So in the meantime, if we want our code to stay in a maintainable state, we need to make sure that each Derivative object is constructed correctly. If not, different people will use the class in different ways.

In that case, ensuring the correctness of the object can be done very easily. We just need to do a bit of Design By Contract. So our constructor will become:

    public Derivative(string name, Instrument underlyingInstrument)
    {
        // Reject invalid arguments before any state is assigned.
        if (name == null || name.Length == 0)
        {
            throw new ArgumentException("The name of the derivative can't be null or empty", "name");
        }
        if (underlyingInstrument == null)
        {
            throw new ArgumentNullException("underlyingInstrument", "A derivative must have an underlying instrument");
        }

        _name = name;
        _underlyingInstrument = underlyingInstrument;
    }

Now you're sure that a fellow developer won't use your object to do a weird thing :).

Of course, you can improve the clarity of this piece of code significantly by using various techniques and frameworks (including Debug.Assert or your own Assert class). For instance, I would write something like this:

    public Derivative(string name, Instrument underlyingInstrument)
    {
        Guard.Against(name == null, "The name of the derivative can't be null");
        Guard.Against(name.Length == 0, "The name of the derivative can't be empty");
        Guard.Against(underlyingInstrument == null, "The underlying instrument of the derivative can't be null");

        _name = name;
        _underlyingInstrument = underlyingInstrument;
    }

But that's a topic for another day!
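In the meantime, here is a minimal sketch of what such a Guard class could look like. The post does not show it, so the implementation below, the choice of ArgumentException and the empty Instrument class are all assumptions made for the example.

```csharp
using System;

// Hypothetical helper: throws when the given condition is true.
static class Guard
{
    public static void Against(bool assertion, string message)
    {
        if (assertion)
        {
            throw new ArgumentException(message);
        }
    }
}

// Placeholder for the real Instrument class.
class Instrument
{
}

class Derivative
{
    private readonly string _name;
    private readonly Instrument _underlyingInstrument;

    public string Name
    {
        get { return _name; }
    }

    public Instrument UnderlyingInstrument
    {
        get { return _underlyingInstrument; }
    }

    public Derivative(string name, Instrument underlyingInstrument)
    {
        // The null check comes first, so name.Length is never evaluated on null.
        Guard.Against(name == null, "The name of the derivative can't be null");
        Guard.Against(name.Length == 0, "The name of the derivative can't be empty");
        Guard.Against(underlyingInstrument == null, "The underlying instrument of the derivative can't be null");

        _name = name;
        _underlyingInstrument = underlyingInstrument;
    }
}

static class Program
{
    static void Main()
    {
        try
        {
            new Derivative("Future on the Dow Jones", null);
        }
        catch (ArgumentException e)
        {
            // prints: The underlying instrument of the derivative can't be null
            Console.WriteLine(e.Message);
        }
    }
}
```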

Filed in .NET, General development | 2 responses so far

AutoResetEvent vs ManualResetEvent: beware!

Julien on Apr 7th 2008

I did load testing last week on the component I am developing and found out that the performance was just miserable... It was barely capable of handling 60 messages per second, which was: 1) bad and 2) surprising! (Basically it's doing some kind of real-time caching/transformation/redirection of messages.)

I then spent 5 minutes in dotTrace trying to get a picture of what was going on. I found that in the thread that is monitoring the queue and sending messages, I had used an instance of ManualResetEvent instead of AutoResetEvent.

If you have never used them, these 2 classes allow you to send signals between 2 threads. That way, the "monitor" thread is not wasting any resources until it's notified by the other thread that there is something in the queue. For instance, thread 1 will wait for a queue to be filled with something like this:

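    private AutoResetEvent _mySignal = new AutoResetEvent(false);

    private void MonitorQueue()
    {
        // monitorQueue is a bool flag declared elsewhere;
        // the loop runs for as long as monitoring is enabled.
        while (monitorQueue)
        {
            // Blocks until another thread calls _mySignal.Set().
            _mySignal.WaitOne();
            GetItemsInTheQueueAndDoStuff();
        }
    }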

And thread 2 will inject data in the queue:

    public void EnqueueItem(Item myItem)
    {
        InsertInQueue(myItem);
        _mySignal.Set();
    }

As I said, my problem was that I used a ManualResetEvent instead of an AutoResetEvent. These 2 classes are almost the same, except that when you use ManualResetEvent, you need to reset the signal manually with _mySignal.Reset();. In my case, once the signal had been set, it was never reset: instead of blocking on _mySignal.WaitOne(); the code was constantly looping and using a lot of CPU!
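The difference is easy to see in a small single-threaded sketch. It uses WaitOne(0), which only tests the state of the signal and returns immediately instead of blocking; this is a demonstration of the two classes, not the code of the component.

```csharp
using System;
using System.Threading;

static class Program
{
    static void Main()
    {
        // AutoResetEvent: a successful wait consumes the signal.
        AutoResetEvent auto = new AutoResetEvent(false);
        auto.Set();
        Console.WriteLine(auto.WaitOne(0)); // prints True
        Console.WriteLine(auto.WaitOne(0)); // prints False (reset automatically)

        // ManualResetEvent: the signal stays set until Reset() is called.
        ManualResetEvent manual = new ManualResetEvent(false);
        manual.Set();
        Console.WriteLine(manual.WaitOne(0)); // prints True
        Console.WriteLine(manual.WaitOne(0)); // prints True (still signaled)
        manual.Reset();
        Console.WriteLine(manual.WaitOne(0)); // prints False
    }
}
```

With a ManualResetEvent that is never reset, every call to WaitOne() in the monitoring loop returns immediately, which is exactly the busy loop described above.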

I fixed it and reran the load test: now I'm at 5000 messages per second at 40% CPU. Much better!

Filed in .NET | One response so far

lock(this): don’t!

Julien on Apr 4th 2008

I've seen this kind of thing in several codebases recently:

    lock(this)
    {
        // Do stuff...
    }

Even if it works perfectly, this is a bad idea. You should never (or at least I can't think of a good reason!) lock on an object that is publicly accessible, and that includes "this". There is a simple reason: you don't know what the callers are going to do with that object.

Let's take a simple example. In the following code, we have a class that uses a lock on this (ClassThatLocksItself) and another class that is going to call it (CallerClass). When CallerClass calls ClassThatLocksItself, it's going to lock on the instance of ClassThatLocksItself.

    using System;
    using System.Threading;

    class Program
    {
        static void Main(string[] args)
        {
            ClassThatLocksItself myObj = new ClassThatLocksItself();

            CallerClass caller = new CallerClass(myObj);
            caller.LockTheObjectInAThread();
            // Give the background thread time to acquire the lock.
            Thread.Sleep(500);

            myObj.LockMe();
        }
    }

    class CallerClass
    {
        private ClassThatLocksItself _myObj;

        public CallerClass(ClassThatLocksItself myObj)
        {
            _myObj = myObj;
        }

        public void LockTheObjectInAThread()
        {
            ThreadPool.QueueUserWorkItem(LockTheObject);
        }

        public void LockTheObject(object state)
        {
            Console.WriteLine("Acquiring lock on the object");
            lock (_myObj)
            {
                Thread.Sleep(100000); // Do a long computation
            }
            Console.WriteLine("Releasing lock on the object");
        }
    }

    public class ClassThatLocksItself
    {
        public void LockMe()
        {
            Console.WriteLine("ClassThatLocksItself -- Trying to acquire lock on this");
            lock (this)
            {
                Console.WriteLine("ClassThatLocksItself -- lock on this acquired");
            }
            Console.WriteLine("ClassThatLocksItself -- lock on this released");
        }
    }

If you try to execute this code, you'll notice that Console.WriteLine("ClassThatLocksItself -- lock on this acquired"); is only executed when LockTheObject() returns (here it takes 100 seconds). By using lock(this), you become dependent on external and unknown factors (in this case, the caller of your code decided to lock on your object). The situation is even worse for developers who want to reuse your ClassThatLocksItself: they have no way of knowing that there can be a synchronisation problem unless they read the code of your class!

Now, try to guess what is going to happen if you change the implementation of LockTheObject with the following:

    public void LockTheObject(object state)
    {
        Console.WriteLine("Acquiring lock on the object");
        lock (_myObj)
        {
            _myObj.LockMe(); // will not block!
        }
        Console.WriteLine("Releasing lock on the object");
    }

Most of you will bet on a deadlock, I guess. However, locks in .NET are reentrant: when LockMe executes lock(this), the CLR detects that the lock is already held by the current thread. Therefore, it doesn't block on it. It makes the whole lock(this) thing very subtle: you'll only see a problem in some specific cases.
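A common remedy, sketched here as an assumption rather than as the only possible fix, is to lock on a private object that no caller can reach. ClassWithPrivateLock, _syncRoot and Increment are names made up for the example.

```csharp
using System;
using System.Threading;

class ClassWithPrivateLock
{
    // Private lock object: external code cannot lock on it.
    private readonly object _syncRoot = new object();
    private int _counter;

    public int Increment()
    {
        lock (_syncRoot)
        {
            return ++_counter;
        }
    }
}

static class Program
{
    static void Main()
    {
        ClassWithPrivateLock myObj = new ClassWithPrivateLock();

        // The caller locks the instance itself, as CallerClass did.
        lock (myObj)
        {
            int result = 0;
            Thread worker = new Thread(() => result = myObj.Increment());
            worker.Start();

            // The worker is not blocked, because Increment() uses _syncRoot
            // and not the instance that the caller has locked.
            bool finished = worker.Join(2000);
            Console.WriteLine(finished); // prints True
            Console.WriteLine(result);   // prints 1
        }
    }
}
```

With lock(this) inside Increment(), the same program would wait for the full two seconds and print False, because the worker thread could not enter the lock held by Main.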


Filed in .NET | One response so far

Yet another blogger on earth!

Julien on Apr 4th 2008

Hi guys, welcome to this new blog!

Like many people, I learned a lot by reading posts from various very good writers (Think Jeremy Miller, JP Boodhoo, Martin Fowler, and dozens of others!). Since I discovered their blogs, their knowledge has always been useful to me. Even if I don't know them personally, they helped me a lot in improving my skills as a developer. In a few words, I owe them a lot!

Starting this blog is a way to try to repay my debt to the community as a whole, at my humble level. As a matter of fact, I feel that there's still a lot to do to improve the way we do software development. Even if there are some excellent books and articles available on the Internet, most people I've worked with have always been strangers to topics such as design patterns, testing, agile methodologies, etc. Even if I don't have the expertise of all the people above (and I also don't claim to master any of these topics myself!), hopefully I'll be able to convey good practices and ideas too. Who knows, maybe I'll even make a small difference around me! :)

Let me also give you some quick background about myself. I'm a French guy working as a .NET developer in the financial industry. I spent the last year in London working for a hedge fund, and I moved back to Paris in February, where I started a new job in a consulting company. Before that, I was studying computer science in a French engineering school, which also led me to do several internships. I also started a company during that time, but that's an old story! Bottom line: I'm still very young and I still have a lot to learn!

Finally, while I'm here, I also apologize in advance for all the spelling mistakes that I'll make in my posts. Please feel free to correct me whenever you spot one (that is probably every 5 words!).

See you!

By the way, I'm also opening a French version of this blog here: www.thedotnetfrog.fr

Filed in various stuff | One response so far