In most object-oriented languages, there is a very specific time when an object constructor is called (namely, when an object is instantiated) and when its destructor is called (namely, when it falls out of scope).
In C#, they have taken the "garbage collection" paradigm one step too far. Not only does memory management rely on it, but even the object destructor is called "somewhen", at an unpredictable time! In a previous version of this document I wrote "somewhen after the object falls out of scope", but it turns out to be even worse, so I'll devote a separate section to the gruesome truth below.
In any case, this means that handy constructs such as an AutoLock
can no longer work (example in C++):
    class AutoLock
    {
    public:
        AutoLock(Mutex& m): m_mutex(m) { m_mutex.Lock(); }
        ~AutoLock() { m_mutex.Unlock(); }
    private:
        Mutex& m_mutex;
    };
In a typical Microsoft-way of thinking, they added a "special case" for this particular example by means of the
lock
keyword (which, incidentally, would be trivial to simulate in C++ should you like the
keyword-taste of it). However, other "automatic" resource management using object lifetime (for example, for
handles, GDI objects, etc.) still won't work.
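To back up the claim that a lock-style keyword is trivial to simulate in C++, here is a minimal sketch. The Mutex below is a hypothetical stand-in that merely records whether it is held, the LOCKED macro name is made up for this example, and the if-with-initializer assumes a C++17 compiler.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the article's Mutex: it only records
// whether it is held, so the effect of the guard is easy to observe.
class Mutex {
public:
    void Lock()   { assert(!m_locked); m_locked = true; }
    void Unlock() { m_locked = false; }
    bool IsLocked() const { return m_locked; }
private:
    bool m_locked = false;
};

// The AutoLock from the article, made non-copyable.
class AutoLock {
public:
    explicit AutoLock(Mutex& m) : m_mutex(m) { m_mutex.Lock(); }
    ~AutoLock() { m_mutex.Unlock(); }
    AutoLock(const AutoLock&) = delete;
    AutoLock& operator=(const AutoLock&) = delete;
private:
    Mutex& m_mutex;
};

// Made-up macro giving the "keyword taste": the guard lives exactly
// as long as the block that follows (C++17 if-with-initializer).
#define LOCKED(m) if (AutoLock auto_lock_guard{m}; true)

// Reports whether the mutex was held while inside the block.
bool held_inside_block(Mutex& m) {
    bool held = false;
    LOCKED(m) {
        held = m.IsLocked();
    }
    return held;
}

// Leaves the block by exception; the guard still unlocks on the way out.
void throw_inside_block(Mutex& m) {
    LOCKED(m) {
        throw std::runtime_error("simulated failure");
    }
}
```

The same AutoLock pattern carries over unchanged to handles, GDI objects and the like: put the release in the destructor and the scope does the rest.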
To resolve this, objects can implement the IDisposable
interface, which has a Dispose()
method. When object lifetime is important to you, you should put the relevant cleanup code in the Dispose()
implementation and remember to call Dispose()
on the object yourself.
It is good practice to have the destructor call Dispose()
on the object too, but as Professional C#, 2nd Edition
puts it: "The destructor is only there as a backup mechanism in case some badly behaved client
doesn't call Dispose()
" (emphasis mine). You see, only badly behaved clients would forget to
clean up after themselves, so I guess only badly behaved clients would need a garbage collector in the first
place, right?
The proposed "better solutions" for this are the using
keyword, like so:
    using (AutoLock theLock = new AutoLock(m_lock))
    {
        // your protected code here
    }
or a finally
clause (which is often recommended over the using
statement), like so:
    AutoLock theLock = new AutoLock(m_lock);
    try
    {
        // your protected code here
    }
    finally
    {
        theLock.Dispose();
    }
Either way (short of the special-cased lock
keyword) I still need to remember typing
Dispose()
by hand.
And it gets worse! Even program termination doesn't trigger proper cleanup. You can verify this with the following program:
    using System;
    using System.IO;

    class TestClass
    {
        static void Main(string[] args)
        {
            StreamWriter sw = File.CreateText("C:\\foo.txt");
            sw.WriteLine("Hello, World?");
            // Note: We "forget" sw.Close().
            // Incidentally, StreamWriter.Dispose(bool) is protected, so we can't call it directly.
        }
    }
The foo.txt
file will be created, but it will be empty. Note that even C specifies that all
unflushed data is written out, and files will be closed, at program termination. And even if I did remember
to call Close()
myself (I wouldn't want to be a badly behaved client, now would I?), this wouldn't be
exception-safe. I am supposed to remember to use using
, or litter my code with finally
blocks.
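For comparison, here is a minimal C++ sketch of the same experiment. The helper names and the file path used to try it out are made up for illustration; the only thing relied upon is that a std::ofstream flushes and closes its file when it is destroyed.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Writes one line and deliberately never calls close() or flush().
void write_greeting(const std::string& path) {
    std::ofstream out(path);
    assert(out.is_open());
    out << "Hello, World?\n";
}   // the stream's destructor flushes and closes the file here,
    // whether we leave normally or through an exception

// Reads the first line back, to check what actually reached the disk.
std::string read_first_line(const std::string& path) {
    std::ifstream in(path);
    std::string line;
    std::getline(in, line);
    return line;
}
```

Calling write_greeting and then read_first_line on the same path should give the greeting back, with no Close(), using or finally in sight.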
I wrote above that I initially thought that objects are destroyed "somewhen after they go out of scope", but in reality it seems to be far, far worse. As it turns out, the JIT compiler can do "lookahead optimization", and may mark any object for collection after what it considers its "last use", ignoring scope!
I have had a colleague ask me about the following code:
    {
        ReadAccessor access = new ReadAccessor(image);
        IntPtr p = access.GetPtr();
        // lengthy piece of code here doing stuff with the pixels from the image
    }
A ReadAccessor
is an object which provides access to the pixel data in an image,
which is stored in a memory mapped file for performance reasons. When a ReadAccessor
is constructed, it maps in the memory, after which you can call GetPtr()
to get at
the actual data. Once it goes out of scope, it unmaps the memory again. So, the "validity" of the
data is guaranteed for the lifetime of the ReadAccessor
.
Incidentally, there is also a WriteAccessor
, which makes sure that there is only a
single writer at any given time. Of course, people using this code in C# quickly found out that
they had to dispose of these WriteAccessor
s manually, because otherwise they'd
get the error that this WriteAccessor
would still be sitting in the garbage bin while
they were trying to acquire a new one. But that's the problem mentioned in the item above. This
one is far, far worse.
The colleague told me that his code crashed somewhere in the pixel-processing code.
It took me a while to figure out what was happening: The JIT optimizer looked ahead a little bit,
decided that access
wasn't being used after the GetPtr()
call, and
marked it for collection. Later on in the code, in the same scope, mind you, the GC
apparently decided it was a good time to destroy the ReadAccessor
, which unmapped the
memory still being used by the code.
I still find this hard to believe (even C# can't be this stupid), but the crash went away by modifying the code like so:
    {
        ReadAccessor access = new ReadAccessor(image);
        IntPtr p = access.GetPtr();
        // lengthy piece of code here doing stuff with the pixels from the image
        System.GC.KeepAlive(access);
    }
This particular item is so mind-boggling that I hope some dear reader can tell me it's just a bad dream and scope is, in fact, honored by the GC.
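For contrast, this is roughly how the accessor behaves with scope-bound lifetime in C++. It is only a sketch: PixelImage, its mapped counter and sum_pixels are hypothetical stand-ins for the real memory-mapped image, chosen so the lifetime can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical image; "mapped" counts the accessors currently alive.
struct PixelImage {
    std::vector<int> pixels;
    int mapped = 0;
};

// Maps on construction, unmaps on destruction, as described in the text.
class ReadAccessor {
public:
    explicit ReadAccessor(PixelImage& image) : m_image(image) { ++m_image.mapped; }
    ~ReadAccessor() { assert(m_image.mapped > 0); --m_image.mapped; }
    ReadAccessor(const ReadAccessor&) = delete;
    ReadAccessor& operator=(const ReadAccessor&) = delete;
    const int* GetPtr() const { return m_image.pixels.data(); }
private:
    PixelImage& m_image;
};

// Sums the pixels through the raw pointer. still_mapped reports whether
// the mapping was intact after the last use of `access` itself.
int sum_pixels(PixelImage& image, bool& still_mapped) {
    ReadAccessor access(image);
    const int* p = access.GetPtr();      // last use of `access`
    int sum = 0;
    for (std::size_t i = 0; i < image.pixels.size(); ++i)
        sum += p[i];
    still_mapped = (image.mapped == 1);  // the accessor is still alive here
    return sum;
}   // only here does the destructor unmap
```

No KeepAlive equivalent is needed: the destructor runs at the closing brace and nowhere else.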
C# imposes an object-oriented paradigm and enforces it by prohibiting the definition of stand-alone functions: every function must be a member of a class.
If you take object-orientation to the extreme, you would not say
    float b = sin(a);
but
    float b = a.sin();
This is clearly impractical. (Ignore the question of how you would take the sine of a number instead of a variable.)
C# (and Java, for that matter) still try to go about half-way there by making the sine function a member of
the Math
class (or namespace — I can never tell them apart in C#):
    float b = Math.sin(a);
If I want a math function of my own, I either have to add it to the Math
class
(which I can't, because it's sealed
) or put up with the strange distinction that I need to write
    float h = Math.sqrt(a*a + b*b);
but
    float h = MyMath.hypot(a, b);
It gets even more scary if you look at the OracleNumber
class, which also has
a sin
method. Luckily, it's static
, and you can't call static
member functions on instances.
This is related to the following item, but that is bad enough that I think it warrants its own item:
The popular ArrayList
container (an auto-resizing container, comparable to C++'s vector
template)
has a Sort()
method. And a Reverse()
method. But not a Randomize()
method.
Why should some algorithms be member functions, but not others? The answer is that no algorithms should be
member functions. What if I wanted to use a different sorting algorithm than the one the original implementers
of ArrayList
had in mind?
Note that an ArrayList
sorts itself, while Array.Sort(...)
is a static
member function of the Array
class.
If I decide, late in a project, at the performance-tuning stage perhaps, that I would be better off using an ArrayList
for some particular collection than the Array
I used up to now, I will likely have to modify my code in
multiple places.
Note that this is not a shortcoming of the language, but it is partly a consequence of item number 2, above.
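A minimal sketch of the alternative, using the C++ standard library: the algorithms are free functions that only see iterators, so the same code serves a std::vector and a std::array, and a randomize is no harder to come by than a sort. The two helper names are made up for this example.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <random>
#include <vector>

// Works for any container with random-access iterators; the container
// itself knows nothing about sorting or reversing.
template <typename Container>
void sort_then_reverse(Container& c) {
    std::sort(c.begin(), c.end());
    std::reverse(c.begin(), c.end());
    assert(std::is_sorted(c.rbegin(), c.rend()));  // now in descending order
}

// The "missing" Randomize(): just another free algorithm.
template <typename Container>
void randomize(Container& c, unsigned seed) {
    std::mt19937 rng(seed);
    std::shuffle(c.begin(), c.end(), rng);
}
```

Swapping the container type late in a project leaves calls such as sort_then_reverse(data) untouched, and a different sorting algorithm is a matter of calling a different function.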
Given a class Vector
, which doesn't overload the comparison operator==
, I can still
write
    Vector a, b;
    if (a == b) { ... }
In C++, this would fail to compile because there is no operator==
defined for
Vector
s; in C#, this will simply compile, but it means "compare the references a
and b
", i.e. it is true
when a
and b
are the same
Vector
, not when their value is equal. Also, because of the following item, you can't
add such an operator yourself without altering the Vector
class:
In C++, given a class Foo
, you can define an operator for adding two Foo
s without
altering the Foo
class itself:
    class Foo {};

    Foo operator+(const Foo& lhs, const Foo& rhs)
    {
        return Foo(/* whatever it means to add two Foos */);
    }
In C#, this is impossible without altering the Foo
class itself. Because of the limitation
mentioned in item number 2, above, you cannot make this operator a "free-standing" one. Of course, adding this operator
has nothing to do with the interface to the Foo
class, so you'd probably try something like this:
    public class FooOps
    {
        public static Foo operator+(Foo lhs, Foo rhs)
        {
            return new Foo(/* whatever it means to add two Foos */);
        }
    }
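This does not compile: a binary operator must take the containing type as one of its parameters. So if you are handed a Vector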
class without overloaded operators, you'll have
to modify the class itself, also introducing a dependency of your class on the module which happens to implement
these operators.
But wait, there's more.
Note that when you overload the operator==
, you also have to overload operator!=
–
but we'll forgive the compiler for not being able to auto-generate it. It will do a similar "helpful"
trick with arithmetic and bitwise assignment operators – when it most definitely shouldn't.
You cannot overload the arithmetic and bitwise assignment operators +=
, -=
, etc.
Instead, they are evaluated in terms of other operators that can be overloaded. This is exactly the
wrong way around; most C++ programmers implement an operator+
in terms of operator+=
.
Suppose you have a class Image
, representing an image. Also, suppose you have some kind of image
processing library, offering functionality to add two images together. For performance reasons, this library
will likely have separate functions for adding one image in-place, overwriting the old contents, and for
returning a new image containing the result of the addition:
    public class ImageProcessing
    {
        public static Image Add(Image lhs, Image rhs);
        public static void AddInPlace(Image lhs, Image rhs);
    }
Your clients would probably like the Image
class to offer operators for this,
so they can write code like
    Image a, b, c;
    c = a + b;  // really c = ImageProcessing.Add(a, b)
    a += b;     // really ImageProcessing.AddInPlace(a, b)
(Never mind that this already requires access to the source of the Image
class, because you have to modify it for this; in
addition, your Image
class can now not be used without the ImageProcessing
class).
You would think you'd overload operator+
for ImageProcessing.Add()
and operator+=
for ImageProcessing.AddInPlace()
, but you can't. Instead, when your client types a += b
, a
whole temporary Image
will have to be constructed, holding the result of the addition, after which the
left operand is replaced with the result. Goodbye, performance!
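For reference, here is a minimal sketch of the usual C++ arrangement, with operator+ written in terms of operator+=. The Image class is a hypothetical toy (a flat pixel buffer), and the construction counter exists only to make temporaries visible.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy image: a flat buffer of pixel values.
class Image {
public:
    explicit Image(std::vector<int> pixels) : m_pixels(std::move(pixels)) { ++s_constructed; }
    Image(const Image& other) : m_pixels(other.m_pixels) { ++s_constructed; }
    Image& operator=(const Image&) = default;

    // In-place addition: overwrites the left operand, no temporary Image.
    Image& operator+=(const Image& rhs) {
        assert(m_pixels.size() == rhs.m_pixels.size());
        for (std::size_t i = 0; i < m_pixels.size(); ++i)
            m_pixels[i] += rhs.m_pixels[i];
        return *this;
    }

    const std::vector<int>& pixels() const { return m_pixels; }

    static int s_constructed;  // counts every Image ever constructed
private:
    std::vector<int> m_pixels;
};

int Image::s_constructed = 0;

// Free-standing operator+ built on operator+=: the copy is made once,
// in the by-value parameter, and only when a new Image is really wanted.
Image operator+(Image lhs, const Image& rhs) {
    lhs += rhs;
    return lhs;
}
```

With this split, a += b constructs no Image at all, while c = a + b pays for the new image it was asked to produce.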
If a tree falls down in the woods and there is nobody there to hear it, does it still make a sound? C# has a very interesting view on this popular Philosophy 101 question.
In C#, there is a concept called delegates. A multicast delegate is a set of methods to be called successively when the delegate is called. When the set of methods is empty, trying to call the delegate raises an exception.
However, events are implemented in terms of multicast delegates, too. You declare a delegate
and an event
like so:
    public delegate void TreeListener();

    class Tree
    {
        public event TreeListener Fell;

        public void Fall()
        {
            // Fall down, and make some noise. To be discussed.
        }
    }
A client subscribes to the event like so:

    class Client
    {
        public Client(Tree tree)
        {
            tree.Fell += new TreeListener(TreeFell);
        }

        private void TreeFell() // This will be called when the tree falls.
        {
            Console.WriteLine("I heard it!");
        }
    }
In the Tree.Fall()
implementation, you'd simply call the event
:
    class Tree
    {
        public event TreeListener Fell;

        public void Fall()
        {
            // Fall down, and make some noise:
            Fell();
        }
    }
But what if nobody is listening to the Tree.Fell
event?
In that case, the multicast delegate will be empty, and calling it will raise an exception. You heard it right
(or did you?): Trees simply aren't supposed to fall over when nobody's around.
The suggested solution is to check whether anybody's listening first (if the event
is empty, it
will be null
):
    class Tree
    {
        public event TreeListener Fell;

        public void Fall()
        {
            if (Fell != null)
                Fell();
        }
    }
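For what it's worth, here is a minimal sketch of the behaviour I would have expected, written in C++. The Tree class and its OnFell method are made up for this example; the point is only that an empty set of listeners is an ordinary empty loop, not an error.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical event source: listeners are kept in a plain list.
class Tree {
public:
    using Listener = std::function<void()>;

    // Registers a listener; empty function objects are rejected.
    void OnFell(Listener listener) {
        assert(listener != nullptr);
        m_listeners.push_back(std::move(listener));
    }

    // Notifies every listener in registration order. With no listeners
    // the loop body never runs and the tree falls in silence.
    void Fall() {
        for (const auto& listener : m_listeners)
            listener();
    }

private:
    std::vector<Listener> m_listeners;
};
```

A tree with two listeners notifies both; a tree with none just falls, and no null check is needed at the call site.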
All in all, I'd strongly recommend against using C# for any serious development.
Sander Stoks – Last edit: 31 Jan 2007