PHP Character encoding in Polish

Just came across an annoying problem when reading and displaying a Polish XML file.
PHP’s default string functions don’t work here because they operate on bytes, and in UTF-8 these characters take 2 bytes instead of 1.
So the mbstring functions come to the rescue. (The mb stands for multi-byte.)
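
To make the byte-versus-character difference concrete, here is a minimal sketch comparing strlen() with mb_strlen() on a single Ó. The variable names are mine, and the "\u{00D3}" escape assumes PHP 7 or later; it is used only so the example does not depend on how the file itself is saved.

```php
<?php
// "\u{00D3}" is the character Ó (U+00D3), encoded as UTF-8 by the escape.
$char = "\u{00D3}";

$byteCount = strlen($char);             // strlen() counts bytes
$charCount = mb_strlen($char, 'UTF-8'); // mb_strlen() counts characters

echo $byteCount, ' ', $charCount, PHP_EOL; // prints "2 1"
```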
Here’s the solution:

mb_internal_encoding('UTF-8');
mb_regex_encoding('UTF-8');
// Replace each character with its numeric character reference, which is valid in XML.
$tmp_info = mb_ereg_replace('Ó', '&#211;', $tmp_info);
$tmp_info = mb_ereg_replace('ó', '&#243;', $tmp_info);

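Note that there are named HTML entities for those characters as well (&Oacute; and &oacute; respectively), but those are defined for HTML only. XML predefines just five named entities (&lt;, &gt;, &amp;, &apos; and &quot;), so for my XML export the named ones didn’t work and I had to use numeric references.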
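
If more characters than Ó and ó need converting, listing each one by hand gets tedious. As a rough sketch (not what I actually used), mbstring’s mb_encode_numericentity() can turn every non-ASCII character into a numeric reference in one call. The helper name to_numeric_references() is hypothetical, and the conversion map is my assumption about which range to cover.

```php
<?php
// Hypothetical helper: converts every non-ASCII character in a UTF-8 string
// to a decimal numeric character reference such as &#211;.
function to_numeric_references(string $text): string
{
    // Map format: [first code point, last code point, offset, bitmask].
    // This map covers everything from U+0080 upwards and leaves ASCII alone.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

    return mb_encode_numericentity($text, $map, 'UTF-8');
}

// Ó (U+00D3), ó (U+00F3) and ł (U+0142), written as escapes (PHP 7+).
$result = to_numeric_references("\u{00D3}\u{00F3}\u{0142}");

echo $result, PHP_EOL; // prints "&#211;&#243;&#322;"
```

This leaves ASCII characters such as & and < untouched, so text going into XML would still need the usual escaping (for example with htmlspecialchars()) on top of it.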

A nice list of both HTML and XML entities can be found at W3Schools.
