Sometimes a browser displays characters that are not human-readable. This usually happens when the content has been scraped from other websites and the browser cannot tell which character encoding some of the bytes use.
Such characters are shown as question marks inside black diamonds: �. The bytes behind them are not valid UTF-8, so the browser shows this placeholder instead, and they carry no meaning for the reader. We run into this problem when we present content from other sites on our own site.
Many suggestions have been proposed to solve this problem, but there is really no way to convert such characters into readable form, so one solution is to remove them. We cannot simply type such a character into a replace call, because it is not on our keyboard.
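To make the cause concrete, here is a minimal sketch that checks whether a byte string is valid UTF-8 using mb_check_encoding() from the mbstring extension. The sample strings are assumptions made up for illustration: the word "café" written once as ISO-8859-1 bytes and once as UTF-8 bytes.

```php
<?php
// Hypothetical sample: "café" encoded as ISO-8859-1, where é is the single byte 0xE9.
$latin1 = "caf\xE9";
// The same word encoded as UTF-8, where é is the two bytes 0xC3 0xA9.
$utf8 = "caf\xC3\xA9";

// mb_check_encoding() reports whether a byte string is valid in the given encoding.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
```

A string that fails this check is the kind a browser would render with the diamond placeholder when the page is declared as UTF-8.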
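One narrow case can be handled without typing the character at all. If the string already contains the diamond itself (the Unicode replacement character, U+FFFD) rather than raw invalid bytes, PHP 7 and later can write it with a Unicode escape. The following is only a sketch under that assumption; the sample string is made up for illustration.

```php
<?php
// "\u{FFFD}" is the PHP 7+ escape syntax for U+FFFD, the replacement character.
$str = "I am \u{FFFD} fine.";

// Remove only that one character; everything else is left untouched.
$clean = str_replace("\u{FFFD}", '', $str);

echo $clean; // I am  fine. (the two surrounding spaces remain)
```

This does nothing for strings that still hold the original invalid bytes, which is where an encoding conversion is needed.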
The problem can be solved in three steps.
1st Step: Such characters have no equivalent in ASCII encoding, so when we convert the string into ASCII they are converted into question marks (?), as are any other non-ASCII characters. So first convert the string into ASCII encoding. In PHP we have the function mb_convert_encoding($string, $to_encoding, $from_encoding) to convert the encoding.
2nd Step: Replace each question mark (?) with an empty string. In PHP we have the function str_replace() to do this. Note that this also removes any genuine question marks in the text.
3rd Step: Convert the string back to UTF-8 encoding.
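$str = "I am � fine.";
// Convert to ASCII (unreadable characters become '?'), strip every '?', then convert back to UTF-8.
$str = mb_convert_encoding(str_replace('?', '', mb_convert_encoding($str, 'ASCII', 'UTF-8')), 'UTF-8', 'ASCII');
// Output :: I am  fine. (the spaces on both sides of the removed character remain)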
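Because the second step removes every question mark, genuine ones are lost as well. A possible variation, sketched here as an assumption rather than as part of the original method, is to tell mbstring to drop unconvertible characters instead of substituting "?". The helper name strip_unconvertible is hypothetical; mb_substitute_character() and mb_convert_encoding() are the real mbstring functions.

```php
<?php
// Hypothetical helper, not part of PHP or mbstring.
function strip_unconvertible(string $str): string
{
    // Remember the current substitute setting so it can be restored afterwards.
    $previous = mb_substitute_character();

    // 'none' makes mbstring drop characters it cannot convert instead of emitting '?'.
    mb_substitute_character('none');
    $ascii = mb_convert_encoding($str, 'ASCII', 'UTF-8');

    // Restore the earlier setting so other code is not affected.
    mb_substitute_character($previous);

    return $ascii;
}

echo strip_unconvertible("Are you \u{FFFD} fine?"); // Are you  fine?
```

Like the three-step version, this removes every non-ASCII character, so accented letters and other legitimate text outside ASCII are dropped too. It only suits content that is expected to be plain ASCII.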