My apologies again, everyone. I realized that importing the correctly formatted, Unicode-compliant XML backup of my blog entries, comments, etc., actually created new page addresses, so all links to my pages were broken. How rude!

So, I’ve reloaded a database copy with the right numbers (fortunately I’ve kept them), and will deal with the Unicode mess later. For the moment, all the Greek and Hebrew texts are garbled. I also noticed that some early posts on the blog are now out of order, which is weird. I can’t tell when that happened, but it must’ve been in one of my earlier moves.


  1. Hi Kevin.

    I’m guessing WordPress uses Perl to do their conversion, etc., and that’s unfortunate because Perl (at least versions 5.8 and earlier) will screw up Unicode Greek (and other Unicode characters) every time they’re read in from a text file.

    If, however, the data is sucked in via XML (like RSS), then, depending on the parser used, accessing the Unicode characters from the XML DOM seems to work just fine, even with Perl 5.8 and earlier (I sling Greek back & forth using the MSXML DOM and Perl daily).

    Kinda weird, but I’m guessing that’s the root of your problem with importing old Unicode data.

    – Rick

  2. It’s not that, though that’s good to know. The WordPress part of it all works just fine (except that importing creates new database row ID numbers, which breaks all links). It’s the actual SQL import that’s not working for me. Get this: the graphical phpMyAdmin tool is set up in such a way that it’s not Unicode compliant, and I’ve got a ticket in about it. The connection is set to unicode_swedish (!) rather than utf8, so SQL commands with any Unicode in them never transmit all characters properly. That includes my imports and my manually constructed queries containing Greek, Hebrew, and accented Latin characters not in the Swedish subset. The GUI is set to Latin1, so I can’t even read the Unicode stored in the database, and there are some other problems besides. So it’s really a server-side misconfiguration, which I can’t fix myself; the hosting service will have to do it.

    At this point, I’ve resigned myself to copying the text from one of my saved backups back into every single post. It’s time-consuming and very frustrating.

    And then there’s the peculiar fact that my earliest posts are out of order. That is, the original dates are preserved, but their ID/page numbers are all mixed up. I don’t know when that happened or how. All the later stuff seems fine. I think it might’ve happened back when I switched from Blogger to WordPress, and I never noticed it until yesterday. So, my first post “Beginnings” is not actually the first anymore!
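
For anyone curious what the kind of character-set mismatch described in the comments above does to Greek text, here is a minimal Python sketch. It is only an illustration under assumptions: the function names `garble` and `repair` are made up for this example, and Python's `latin-1` codec stands in for whatever the host's connection and GUI settings actually do, which may well differ in detail.

```python
# Hypothetical illustration: UTF-8 text passing through a layer that
# assumes a single-byte Latin-1 encoding. Not the host's actual setup.

# The Greek word "logos", written with escapes so the source file's own
# encoding cannot affect the example.
GREEK = "\u03bb\u03cc\u03b3\u03bf\u03c2"


def garble(text: str) -> str:
    """Encode as UTF-8, then misread every byte as one Latin-1 character."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo the misreading, assuming no bytes were dropped along the way."""
    return garbled.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    broken = garble(GREEK)
    # Each Greek letter here takes two bytes in UTF-8, so the garbled
    # string has twice as many characters as the original.
    print(len(GREEK), len(broken))
    print(repair(broken) == GREEK)
```

The round trip only works when every byte survives the wrong layer. If the misconfigured connection replaces or drops characters it cannot represent, the original text is gone, and restoring from a clean backup is the only option.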

