Monday, September 08, 2008

Strange characters ’ and Â in WordPress posts

After a WordPress upgrade we started to get all kinds of weird symbols in our posts, including Â and ’. I figured it was a character encoding mismatch, and a quick search on the WordPress forums confirmed it. You have to comment out two lines in your wp-config.php file (found in your main blog directory). These are the two lines:
define('DB_CHARSET', 'utf8');
define('DB_COLLATE', '');

Comment them out like this:

//define('DB_CHARSET', 'utf8');
//define('DB_COLLATE', '');

Fixed our problem.
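
For the curious, here is a minimal sketch of how this kind of mismatch can produce those symbols. It assumes the text was stored as UTF-8 and then read back as Windows-1252 (cp1252), which is a common cause but not necessarily what happened on every blog. The function name misread_as_cp1252 is made up for illustration, and Python is used only because it makes the bytes easy to see.

```python
def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then (wrongly) decode those bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


# A curly apostrophe (U+2019) is three bytes in UTF-8: E2 80 99.
# Read one byte at a time as Windows-1252, they become three characters.
print(misread_as_cp1252("you\u2019d"))  # you’d

# A degree sign (U+00B0) is two bytes in UTF-8: C2 B0.
print(misread_as_cp1252("89\u00b0"))    # 89Â°
```

So one "fancy" character turns into two or three junk characters, which is why the symbols tend to cluster around curly quotes, dashes and non-breaking spaces.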

54 comments:

Anonymous said...

Thank you so much for the information, it worked :D Just when I thought that I had to edit the whole 500 posts in my database.

Techwiz said...

No problemo!

Something else I have noticed is that Firefox's character encoding can mess things up too.

If you see some of those wild and wacky characters, try going to View > Character encoding. Choose a different character encoding than the one you have now, and see what it does to the web page. So if you are currently set to Western ISO-8859, try UTF-8 and see what happens.

Anonymous said...

Thank you so much for the info! I too thought I'd have to edit my entire blog to remove the annoying character.

dfisher said...

Thank you! I upgraded to 2.7 and had already edited the last 50 posts to remove the weird characters, and then found them back again this morning. Finding this was a big relief.

Mishka said...

Thanks for this, saved my butt! (or at least hours of annoyance trying to figure out what's going on) ;)

Anonymous said...

Many, many thanks for this tip!

John Jr. said...

Hm... I just freshly installed WordPress, started having these very same strange characters show up in posts, found this site, opened up my wp-config.php file, and found it didn't even have those lines you mentioned. How can this be?! Any tips on licking this problem would be most appreciated!

John Jr. said...

Ooops. Sorry for the follow-up post here, but I forgot to check the "Email follow-up comments to..." button. This time I did!

Techwiz said...

John, Post a link to your site, so we can see the strange characters.

:-)

John Jr. said...

D'oh! That was silly. Yes, here it is in action:

http://www.lasalamander.com/projectography

Of course, as I'm only just starting this site, there are only a couple entries, and I've been fixing the weird characters whenever they show up, so I don't have a lot of great examples. But they do seem to just randomly show up, so who knows. Maybe there'll be a new one by the time you look.

However, I did just spot a new one that showed up--one that I already deleted before but has come back anyway. If you look under the excerpt text of the A Random Test Post entry, you'll see it says "Filed under: 3am, 89Â°, no wind" That Â definitely shouldn't be there.

Also, in the post Captain Chancey, when I originally posted it, the lines showed up like this:

Among his many endeavors, Joshua Wentz does a somewhat improvised music project every once in a while that he calls Winchester Sessions.Â

And somewhere else I remember the word "you'd" got turned into this: you’d

Any ideas, Techwiz? :D

John Jr. said...

Well, of course, by now my previous post has become irrelevant. Not that I found a solution. It just seems to have gone away by itself.

Meanwhile, I've got a new problem which is not at all the same thing, but hey, it can't hurt to ask since I'm totally stuck on a solution:


So I'm new to theme editing but learning quickly. I was trying to figure out how to edit the widgets themselves by tweaking my widgets.php file, and then something odd happened: At the top of all of my site's pages (as well as at the top of the admin pages, somehow!!) there's this line:

it', 'widget_many_register' ); */ ?>

Okay, so my first thought is that I messed up my widgets.php file, so I'll just undo the damage by replacing it with the backup copy I made in the event that something like this should happen. And so I do. But...nothing changed! That crazy bit of code is still there. In fact, on my Dashboard page it's also the first line in the Incoming Links, Plugins, Wordpress Development Blog, and Other Wordpress News windows. That's crazy! What on Earth did I do? Any clues?

Thanks!

http://lasalamander.com/blog

Techwiz said...

Hi John,
Sorry I didn't get back to you sooner on the strange characters issue. I am a lazy blogger. So lazy, in fact, that I don't even read the comments on my blog.

Re: your current issue...the widget_many_register thing is a broken bit of code. It looks like it used to be a bit of inline CSS style that has been commented out.

Textpad will let you search through a folder of files. Use it to find your little broken bit of code, and then hopefully you will see what needs to be done: delete it, or restore it from the original file.

Let us know how it goes.

John Jr. said...

Well I definitely thank you for your willingness to help, Techwiz. I'm so confounded on this issue, and I'm hoping I won't have to just go and reinstall WordPress to fix it. I did find the source of that bit of weird code. It happens to be the very last bit in the widgets.php file. So the very first thing I tried was to restore the original widgets.php file, but it changed nothing! I also went and tried deleting that particular bit of text from the file, but still, it's showing up on the site anyway. Where on earth is it coming from?!

John Jr. said...

See, if you just keep on approving my comments, I'll eventually work it out on my own. That is to say, I fixed it. Or...it fixed itself. I didn't exactly find the solution, but it's all better now. All I did was upgrade WordPress to the latest version and presto-chango, it fixed itself. Thanks for giving me a forum to post my progress. ;)

Techwiz said...

I *love* this new method of tech troubleshooting. All I have to do is approve comments and problems get solved!
Upgrading, like reinstalling, is a good way to sort out a lot of technical glitches. Glad things are working now.

Adriana said...

Hello. I know this is a super old post but it's the only one that I googled that gave me what I think may be an answer to my problem

I too type and it comes out in strange characters like the ones you show. HOWEVER, my problem is not with wordpress, but rather with blogger.

Do you happen to know how I can fix this same problem in blogger?

thanks a million!

Techwiz said...

Post a link, Adriana. :-)

Does it happen to all your posts, or certain ones only?

When did it start happening?

Anonymous said...

Wow! Thanks a lot! I accidentally deleted my database but I recovered it. When I got it back these strange symbols were all over the place. Your advice worked perfectly!

Gil said...

Thanks for this post, it was very helpful. All my posts had those strange symbols in them. I did notice in my database view with phpmyadmin that there was a new character type property that I didn't think I had seen before. I think it must have something to do with that. Anyway, your fix worked beautifully. Thanks!!!

Anonymous said...

Your fix apparently worked for strange characters that appeared in many posts when I moved a wordpress site from one domain to another account with the same host. Tried all other suggestions, like changing the table column to blob, then back to text, etc. Nothing worked until I used your fix of commenting out the two lines in the config file. Thanks.

Unknown said...

.. or you could just change ..

define('DB_CHARSET', 'utf8');

to

define('DB_CHARSET', 'utf16');

Anonymous said...

FanFlippingTastic, I was reading the other horribly techie looking posts on the wordpress forums and thinking, 'whoa, I'm going to need a lot more coffee before I attempt this lot', and then I saw the link to this fix. 30 seconds later and all is well.

Thankyou!!

Unknown said...

This was mighty helpful - thanks for sharing!

Unknown said...

Thank you. Fixed my issue in about 10 secs. Excellent.

Andrew Kronemyer said...

It worked! Thank you so much!

amanda said...

LOL ! You have no idea how much you have helped me, I too, like Sheila would have just had to manually edit 400+ posts hahahaha Thanks so so much!

Anonymous said...

I am so grateful that I cannot begin to explain, I have an e-commerce site with hundreds of posts that all freaked out with these alien symbols.
My client is now releasing the rest of his payment to me. Thank you so so much. You are a freaking Legend!

Techwiz said...

I am glad that this post is still helping people! Thanks for all the nice comments everyone!

rafaelromero said...

thanks, it fixed mine!

Paris Wells said...

Thanks a lot, nearly had to go through them all manually!

hubbers said...

Or, if you want to keep your encoding as utf8, you will need to run a query like this one on your database:

UPDATE wp_posts SET post_content = replace( post_content, '’', "'" )

Be warned that if you have one badly encoded character then you will have loads more, for " £ ' - _ etc., and each one will need its own carefully written replacement.
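
If the damage really is UTF-8 that was read as Windows-1252, the whole family of bad characters can in principle be reversed in one pass rather than one REPLACE per character. Here is a rough sketch of the idea, assuming exactly one round of mis-decoding. fix_mojibake is a hypothetical helper, not part of WordPress or MySQL, and anything like this should be tried on a backup copy first.

```python
def fix_mojibake(text: str) -> str:
    """Undo one round of 'UTF-8 bytes read as Windows-1252'.

    Returns the text unchanged if it does not look like that kind of damage.
    """
    try:
        # Turn the junk characters back into the original bytes,
        # then decode those bytes the way they were meant to be read.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Already clean, or damaged in some other way: leave it alone.
        return text


print(fix_mojibake("you\u00e2\u20ac\u2122d"))  # you’d
print(fix_mojibake("89\u00c2\u00b0"))          # 89°
```

Text that is already clean comes back untouched, because it fails one of the two conversions. The same is true of a string containing a character with no Windows-1252 byte, so some damaged text (a mangled closing curly quote, for example) may still need to be fixed by hand.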

Chris said...

Perfect! thanks so much! I was looking at doing an export and reimport of the database, but this correct fix is so much simpler :D

Nikp said...

Hi Techwiz,

Thanks a lot for posting this. I just upgraded and moved a site to WP3 and my ecommerce application had developed this problem.

Saved me a load of time, cheers :)

Nik

Sarah said...

Hello:

I have a WP blog that is doing the same thing. I have commented out the two statements in wp-config.php, with no change, and I have also loaded the UTF8 Sanitize plugin, which has not helped either.

The blog is: http://sherrymchenry.com/blog/

You can see the odd characters in the very first paragraph.

Thanks -Sarah sarahaustin@fuse.net

Jim said...

I'm having the same problem in WordPress 3.0.4. But just in the feed, not in the posts.

Made the changes you recommend, but my feed still has bÂd characters according to FeedValidator.

Jim.MD/Feed

Techwiz said...

Seeing as how this post is two years old and wordpress has gone through several revisions since 2008, I am not surprised that my suggestions aren't working anymore. Try a question on the official Wordpress forums?

Also, some additional details would help. Did this just suddenly start to happen? Did it happen after an upgrade? After a new plugin was installed? Did your hosting company change versions of php or mysql on you?

ra said...

Hey guys, crazy enough, I found that enabling the w3 total cache plugin resolved the issue for me.

Annemieke said...

Thank you so much for this really simple solution, saved me a lot of work!

Techwiz said...

Hi Annemieke, you're welcome, and, you have a really interesting blog.

Anonymous said...

Thank you so much! I spent many hours trying to find out why, and you gave me the answer :-)

Jennette Fulda said...

Thanks for this post! You just saved me a ton of time.

Pretty Shiny Sparkly said...

Thank you so much for this, I was freaking out about this and it worked miraculously with your fix! Many thanks!!

Anonymous said...

Something else that might help you all so much is if you're structuring your website using PHP, you can surround any output with utf8_decode(CONTENTSTRING HERE); and that will remove all of those pesky little hoozywhatsies

Dustin C said...

Thanks this resolved my problem.

Anonymous said...

This is still very useful, even though a few years have gone by. Thanks for posting!! Funny characters... be gone!

Anonymous said...

You legend. How easy was that! Thank you so much.

Matthew Cobb said...

This fix corrected all my †and similar characters after moving my blog from server A to server B.

Thanks for the tip!

Clayton said...

Hell yes. this worked. I did try to change the database char set in phpmyadmin, so it didn't work at first.

I changed it back to the way it was originally and then did this hack and it fixed it.

Unknown said...

Worked a treat! Thanks

B Mills said...

Thanks man! I'm not exactly a coding guru, but know enough to build a Wordpress site and fumbled my way through a more-complicated-than-normal transfer of the database to a new hosting site. I got all those goofy figures and assumed I was in for a nightmare and would probably have to call up my coding guy. But this was pretty simple and fixed my problem. So yay! Took me all of 1 minute to find, change, and test.

Unknown said...

Still works on WP 4.1

Anonymous said...

I am getting these kind of symbols after updating to Wordpress 4.22. Any idea of how to fix this?
diabetes during th&#1077 first 26 weeks &#959f pregnancy, th&#1077 child

Techwiz said...

Those are Russian and Greek character codes. &#1077 is the Cyrillic small letter "е", which looks just like a lower case Latin "e", and &#959 is the Greek small letter omicron, which looks just like a lower case "o".

So, why are these Greek and Russian character codes in your English copy?
I tried pasting the same Reuters copy into a text editor, Dreamweaver, Blogger, and Wordpress and didn't get any of those character codes.

Is your web browser character encoding set to Unicode or to some other language like Greek or Russian?

Have you tried the instructions I gave, above?
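
If you want to check what a numeric character code like those actually is, a few lines of Python will tell you. This is only an illustrative sketch. The describe function is a made-up name, and it assumes the number is a decimal Unicode code point, as in &#1077.

```python
import unicodedata


def describe(code: int) -> str:
    """Return the official Unicode name for a decimal code point."""
    return unicodedata.name(chr(code))


for code in (1077, 959):
    # Print the code, the character itself, and its Unicode name.
    print(code, chr(code), describe(code))
```

Both characters are look-alikes for ordinary Latin letters, so they are easy to miss until something displays them as raw codes.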


Unknown said...

This post helped me out today. :)