Objects vs. data structures

Several popular statically typed programming languages, including C++, Java and C#, have a serious design flaw. They make no useful distinction between objects in the OOP sense and plain data structures.

A plain data structure should only be a dumb container for data. It should not have any behaviour, only simple accessors for data items in it. It should not be possible to override data accessors. It should be clear and transparent what it does.

Good old C has a working construct for plain data structures, the struct.

C++, even though trying to be compatible with C, managed to mess it up. C++ added class for OOP objects, but also extended struct to be more or less equivalent to class. Even though there seems to be a convention in C++ to use struct for plain data structures and class for OOP objects, there is no guarantee that a struct in fact is just a plain data structure without surprising behaviour.

Java made it worse by skipping struct completely and providing no other construct for plain data structures than class which is primarily made for OOP objects. In the Java community, there is no general convention for making any distinction. To make things even worse, the horribly verbose Java Beans convention is quite popular and has support in the standard library.

C# actually have a meaningful difference between struct and class, making struct more suitable for plain data structures. However, it seems like struct have too much OOP like functionality built in and that it is possible to provide surprising behaviour.

Scala has class for OOP objects, but does also provide case classes which are suitable for plain data structures. Case classes also have some bonus features like pattern matching and auto generated methods for equality, hash code and string representation. Together with Scala language features like named parameters and immutability by default, they work reasonably well as plain data structures. However, I wish they could be more restrictive and not allow OOP features like general inheritance (which does not work well for case classes anyway) and overriding accessors.

It is possible to use associative arrays as data structures, and then you get no surprising behaviour. This is popular in Javascript which have nice syntax for this. However, most other languages do not have a convenient syntax for this, and in particular it does not work well in a statically typed language. I like to declare my data structures, and it should not be possible to add arbitrary values to instances of them at runtime. Associative arrays usually also have a runtime performance overhead.

It is also possible to use tuples (which some languages have) as data structures, but I want the items in them to be named, not ordered.

Is my only choice to use C?

Updated: mentioned that it is about statically typed languages

Update: Java finally fixed this by adding record.

This entry was posted in Java, programming. Bookmark the permalink.

2 Responses to Objects vs. data structures

  1. Jörgen says:

    I guess you specifically mean typed languages, that you would like a typed data structure? Could be emphasized more in the beginning of the text? From a non explicitly-typed background it is not so important:
    “Several explicitly typed popular programming languages, including C++, Java and C#, have a serious design flaw. “

  2. Yes, maybe it is only relevant for statically typed languages (I think that’s the correct term, rather than explicitly typed).


Comments are closed.