More UTF-8 head-thumping with Hibernate 3
I’ve just finished upgrading an application component so that it uses Hibernate 3 instead of Hibernate 2. The last time I tried to do this, I spent half a day on it, realised that none of my UTF-8 encoded data was coming through intact and simply abandoned the effort. But I was feeling brave, so I tried again.
First off, since I’m running on OS X, I had to learn that Firefox on OS X doesn’t handle a lot of Indic text properly (notice that this bug is just over three years old!). That makes it really hard to test when you are looking at question marks instead of Tamil! Solution: use Safari. It works fine. A good test page is the OpenOffice Tamil intro.
But it still wasn’t working for me. I fell back on the technique that I used in an earlier blog post, specifically comparing MD5 checksums of the text in question before and after it went through the database. And, yes, there was definitely a problem.
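For anyone who wants to repeat the checksum test, here is a minimal sketch of the idea rather than the exact code from the earlier post. The class name Utf8Checksum and the sample strings are made up for illustration. It hashes the UTF-8 bytes of a string, so a value that has been turned into question marks somewhere along the way produces a different checksum from the original.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Hibernate or MySQL Connector/J.
public class Utf8Checksum {

    // Returns the MD5 digest of the string's UTF-8 bytes as lowercase hex.
    public static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The word "Tamil" in Tamil script, written with escapes so the
        // source file's own encoding cannot interfere with the test.
        String original = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd";
        // Assumed stand-in for what comes back when the encoding is lost.
        String mangled = "?????";

        System.out.println("original: " + md5Hex(original));
        System.out.println("mangled:  " + md5Hex(mangled));
        System.out.println("match:    " + md5Hex(original).equals(md5Hex(mangled)));
    }
}
```

In a real test you would compute one checksum before saving the text and another after loading it back through Hibernate. If the two differ, the data was altered in transit, whatever the browser happens to display.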
The solution: you need extra parameters in your JDBC connection string when using Hibernate 3:
jdbc:mysql://localhost:3306/mydb?autoReconnect=true&useUnicode=true&characterEncoding=UTF-8
… and now things work again (well, in Safari, anyway).
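One thing to watch, assuming your connection URL lives in an XML file such as hibernate.cfg.xml (I haven't said where mine lives, so treat that as an assumption): the ampersands between parameters have to be written as &amp; there, or the file will not parse. The sketch below uses a hypothetical helper class, MysqlUtf8Url, to build the same URL as above and print both the plain form and the XML-escaped form.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Hibernate or Connector/J.
public class MysqlUtf8Url {

    // Builds a MySQL JDBC URL with the Unicode parameters from the post.
    public static String buildUrl(String host, int port, String database) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("autoReconnect", "true");
        params.put("useUnicode", "true");
        params.put("characterEncoding", "UTF-8");

        StringBuilder url = new StringBuilder();
        url.append("jdbc:mysql://").append(host).append(':').append(port)
           .append('/').append(database);

        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator).append(entry.getKey())
               .append('=').append(entry.getValue());
            separator = '&';
        }
        return url.toString();
    }

    // Escapes ampersands so the URL can sit inside an XML element.
    public static String xmlEscape(String url) {
        return url.replace("&", "&amp;");
    }

    public static void main(String[] args) {
        String url = buildUrl("localhost", 3306, "mydb");
        System.out.println(url);            // form for Java code or a .properties file
        System.out.println(xmlEscape(url)); // form for an XML configuration file
    }
}
```

The first line printed is the connection string exactly as shown above. The second is the form to paste into an XML file.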