Clojure: Maps Are Not The Answer

I recently attended a meeting of the London Clojure Group. During this meeting, Robert Rees presented a lightning talk (available online thanks to Skillsmatter) on his newfound love for plain old maps when writing code in languages like Clojure and Python.

I find it really interesting how the development community is exploring different ways to write software, but this suggestion sounds to me like a step backwards, and quite a dangerous one. Let me elaborate.

Maps are Unsound

To understand my issues with using maps as your core abstractions, let’s examine the following example:

user> (defn is-anonymous? [person]
  (nil? (:name person)))

user> (def person {:name "Phil Calçado"})
(def fruit {:type "banana"})

user> (is-anonymous? person)
false
user> (is-anonymous? fruit)
true

If we pass a fruit to our function it will be treated as an anonymous entity. The problem here is that the question asked doesn’t make sense to begin with! Instead of complaining about being asked something that doesn’t make sense, the system just returns an arbitrary answer, which will go unnoticed and come back to bite you at some later stage.

This is the same kind of error you have in a weakly typed language, like PHP:

pcalcado@ziege ~$ php -a
Interactive mode enabled
$age = 28;
$name = "Phil Calçado";
echo "Dear," . $name . "\nNext year you will be: " . ($age + 1 ) . "\n";
echo "Dear," . $name . "\nNext year you will be: " . ($name + 1 ) . "\n";
Dear,Phil Calçado
Next year you will be: 29
Dear,Phil Calçado
Next year you will be: 1
pcalcado@ziege ~$ 

There is an obvious mistake in the program above, but PHP will just carry on. This is bad.

As we discussed before, this has nothing to do with Dynamic vs. Static typing. Let’s see how Ruby deals with this situation:

pcalcado@ziege ~$ irb
ruby-1.9.2-p180 :001 > age = 28
 => 28 
ruby-1.9.2-p180 :002 > name = "Phil Calçado"
 => "Phil Calçado" 
ruby-1.9.2-p180 :003 > print "#{name},\nNext year you will be #{age+1}"
Phil Calçado,
Next year you will be 29 => nil 
ruby-1.9.2-p180 :004 > print "#{name},\nNext year you will be #{name+1}"
TypeError: can't convert Fixnum into String
        from (irb):5:in '+'
        from (irb):5
        from /Users/pcalcado/.rvm/rubies/ruby-1.9.2-p180/bin/irb:16:in '<main>'
ruby-1.9.2-p180 :006 >  

Ruby and Clojure are strongly typed languages: they will try to stop you from shooting yourself in the foot. PHP is weakly (or loosely) typed, which in this example means it will let me proceed even when there is a clear problem in the way I am using my types.

By using maps this way in Python, Ruby or Clojure you are pretty much turning those languages into PHP.

Data Structures in Clojure

I started doing some Clojure back in 2008. My first pet project was a testing framework, which eventually became the –long deprecated– RTFSpec library. The first real project to use RTFSpec was an application I wrote as part of the alpha testing programme for Google AppEngine for Java.

Back then, people used mostly struct-map and lists, and RTFSpec followed this pattern. While writing the GAE/J application, though, I found many subtle bugs in the testing infrastructure –and bugs in your testing infrastructure are really painful to detect. If I recall correctly, the problem was around these two structures:

(defstruct specification-result :specification :results :status)
(defstruct specification-list-results :specifications :results :status)

The first structure represents the results for a single test case; the second is a summary of all test cases. Because they are similar in shape, it was too easy to use one instead of the other in function calls. Eventually I would trace a NullPointerException back to some code trying to look up :specifications on a specification-result. That value was expected to always be a list –even if empty– but, because those were maps, the lookup was returning nil.

My solution to the problem was to quickly hack together a dynamic checking layer, which eventually became its own pet project –also deprecated– called struct-quack. What struct-quack did was simply throw an exception when you tried to get a key not defined for that structure, emulating the behaviour we had in the Ruby example above.
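
The idea of failing loudly on an undefined key can be sketched in a few lines. This is not struct-quack's actual code or API: strict-get is a hypothetical helper name, and the sample result map is made up for illustration.

```clojure
(defn strict-get
  "Like (get m k), but throws when k is not present in m.
  Hypothetical helper for illustration; not struct-quack's real API."
  [m k]
  (if (contains? m k)
    (get m k)
    (throw (IllegalArgumentException.
            (str "Key " k " is not defined for " (pr-str m))))))

;; A made-up single-test result, shaped like specification-result.
(def result {:specification "adds two numbers" :results [:pass] :status :ok})

(:specifications result)      ; plain lookup: quietly returns nil
(strict-get result :status)   ; defined key: returns :ok
;; (strict-get result :specifications) throws IllegalArgumentException
```

With a lookup like this, using the wrong structure fails at the point of the mistake instead of leaking a nil into later code.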

I replaced all my struct declarations with struct-quack calls, and ran my tests again. The tests failed with many violations I didn't even know of. I spent the rest of the day fixing the problems and started using struct-quack for all my projects.

Using Maps Properly

So maps are evil? No! Seasoned Clojure programmers use maps all the time.

The best use case for maps is to create composable Input/Output for functions. For composability, untyped key-value pairs are a very good solution –that’s the approach used in Map/Reduce systems like Hadoop.
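
As a rough sketch of what composable map input and output looks like, each step below takes a map and returns that map with one more key added. The function names and keys (parse-age, next-birthday, :raw-age and so on) are invented for this example.

```clojure
;; Each step takes a map and returns a map, adding only what it knows about.
(defn parse-age [m]
  (assoc m :age (Long/parseLong (:raw-age m))))

(defn next-birthday [m]
  (assoc m :next-age (inc (:age m))))

;; comp applies right to left: parse-age runs first, then next-birthday.
(def pipeline (comp next-birthday parse-age))

(pipeline {:name "Phil" :raw-age "28"})
;; => {:name "Phil", :raw-age "28", :age 28, :next-age 29}
```

Because no step cares about keys it does not use, steps can be added, removed or reordered without changing a shared type.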

Maps also work great when defining the protocol for higher-order functions – Ring is a good example.
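
A minimal sketch of that style, written without requiring Ring itself: a handler is a function from a request map to a response map, and middleware is a function that takes a handler and returns a new one. The keys :uri, :request-method, :status, :headers and :body follow Ring's conventions; hello-handler and wrap-server-header are hypothetical names, not part of Ring.

```clojure
;; A handler: request map in, response map out.
(defn hello-handler [request]
  {:status 200
   :headers {"Content-Type" "text/plain"}
   :body (str "Hello from " (:uri request))})

;; Middleware: a higher-order function that wraps a handler.
;; Hypothetical example, not something Ring ships with.
(defn wrap-server-header [handler]
  (fn [request]
    (assoc-in (handler request) [:headers "Server"] "sketch")))

(def app (wrap-server-header hello-handler))

(app {:request-method :get :uri "/fruit"})
```

The map is the whole contract here: any function that accepts and returns maps of this shape can be slotted into the chain.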

When you are creating an abstraction which will be long-lived, like my test-results structure above, you should consider using one of the many good options for data types and protocols made available by the language.
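
As one possible sketch of that advice, the two test-result structures could be written as records that share a protocol. The names Summarizable, passed?, SpecificationResult and SpecificationListResults are assumptions made for this example and are not taken from RTFSpec.

```clojure
(defprotocol Summarizable
  (passed? [this] "True when nothing in this result failed."))

(defrecord SpecificationResult [specification results status]
  Summarizable
  (passed? [_] (= :ok status)))

(defrecord SpecificationListResults [specifications results status]
  Summarizable
  (passed? [_] (every? passed? specifications)))

(def single  (->SpecificationResult "adds two numbers" [:pass] :ok))
(def summary (->SpecificationListResults [single] [:pass] :ok))

(passed? summary)                            ; true
(instance? SpecificationListResults single)  ; false: the two types are distinct
;; (passed? {:type "banana"}) throws IllegalArgumentException,
;; because a plain map does not implement Summarizable.
```

Records still behave like maps for keyword lookup, so this does not remove every silent nil, but asking a fruit a question meant for a test result now fails at the call site.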