What a nightmare… I was trying to make web content that used various extended characters (like the Euro, etc.) work with an email distribution system (Lyris) which has a way to go out and grab a file. I wrote the code to create these files, and we automated what was previously a manual and tedious process.

HOWEVER…

Things were getting screwed up in the email. The pages looked fine in a browser, but the email distribution process would scramble the extended characters. So I went digging. With the help of a business colleague (he realized we needed entity references in some cases), I happily added:

HttpUtility.HtmlEncode(strDataToEncode);

 

Unfortunately, Microsoft's implementation does not appear to handle the Euro symbol (€), at least judging by the Mono code I saw. So I had to fix this.

I found the source code and modified it to cover this one character.

private static String EncodeContent(string strDataToEncode)
{
    if (!String.IsNullOrEmpty(strDataToEncode))
    {
        // Highest 7-bit ASCII value; anything above it becomes a numeric reference.
        const int CONST_BEYOND_7BIT_CHAR = 127;
        StringBuilder output = new StringBuilder();

        foreach (char c in strDataToEncode)
            switch (c)
            {
                // I had to add this section for the Euro
                case '€':
                    output.Append("&euro;");
                    break;
                case '&':
                    output.Append("&amp;");
                    break;
                case '>':
                    output.Append("&gt;");
                    break;
                case '<':
                    output.Append("&lt;");
                    break;
                case '"':
                    output.Append("&quot;");
                    break;
                default:
                    Int32 iCharValue = Convert.ToInt32(c);

                    if (iCharValue > CONST_BEYOND_7BIT_CHAR)
                    {
                        // Emit a decimal numeric character reference, e.g. &#233;
                        output.Append("&#");
                        output.Append(iCharValue.ToString());
                        output.Append(";");
                    }
                    else
                        output.Append(c);
                    break;
            }

        return output.ToString();
    }
    else
        return "";
}
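
To make the behavior concrete, here is a small usage sketch. It restates the method in condensed form so that it compiles on its own; the class name EncodeContentDemo and the sample string are made up for illustration, and it assumes the Euro case emits the named entity &euro; while everything else above 127 falls through to a decimal numeric reference.

```csharp
using System;
using System.Text;

public static class EncodeContentDemo
{
    // Condensed restatement of EncodeContent, for illustration only.
    public static string EncodeContent(string data)
    {
        if (string.IsNullOrEmpty(data))
            return "";

        var output = new StringBuilder();
        foreach (char c in data)
        {
            switch (c)
            {
                case '\u20AC': output.Append("&euro;"); break; // Euro sign (assumed entity)
                case '&': output.Append("&amp;"); break;
                case '>': output.Append("&gt;"); break;
                case '<': output.Append("&lt;"); break;
                case '"': output.Append("&quot;"); break;
                default:
                    if (c > 127)
                        output.Append("&#").Append((int)c).Append(';'); // numeric reference
                    else
                        output.Append(c);
                    break;
            }
        }
        return output.ToString();
    }

    public static void Main()
    {
        // \u00E9 is e-acute (233) and \u2122 is the trademark sign (8482).
        Console.WriteLine(EncodeContent("Price: \u20AC5 <b>caf\u00E9</b> \u2122"));
        // Prints: Price: &euro;5 &lt;b&gt;caf&#233;&lt;/b&gt; &#8482;
    }
}
```

Note that the trademark sign needs no special case here: it is above 127, so it comes out as the numeric reference &#8482;.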

 

Here is the code from Novell which inspired this (and which I reference in my documentation):

public static string HtmlEncode(string s)
{
    if (s == null)
        return null;

    StringBuilder output = new StringBuilder();

    foreach (char c in s)
        switch (c)
        {
            case '&':
                output.Append("&amp;");
                break;
            case '>':
                output.Append("&gt;");
                break;
            case '<':
                output.Append("&lt;");
                break;
            case '"':
                output.Append("&quot;");
                break;
            default:
                if (c > 159)
                {
                    output.Append("&#");
                    output.Append(((int)c).ToString(CultureInfo.InvariantCulture));
                    output.Append(";");
                }
                else
                {
                    output.Append(c);
                }
                break;
        }

    return output.ToString();
}

 

So you might be wondering: why does the encoding start at 128 in my code but at 160 in Mono's? Frankly, I am not sure, and I would love any and all feedback.
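
One possible explanation, offered as a guess rather than anything confirmed from the Mono source: in Unicode (and ISO-8859-1), code points 128 through 159 are the C1 control characters, and 160 (the no-break space) is the first non-control character above 127. The sketch below only checks that observation with char.IsControl; the class and method names are hypothetical.

```csharp
using System;

public static class ThresholdGapDemo
{
    // Counts the code points in [first, last] that .NET classifies as control characters.
    public static int CountControlChars(int first, int last)
    {
        int count = 0;
        for (int code = first; code <= last; code++)
            if (char.IsControl((char)code))
                count++;
        return count;
    }

    public static void Main()
    {
        // Every code point between the two thresholds (128..159) is a C1 control.
        Console.WriteLine(CountControlChars(128, 159)); // 32
        // 160 is the no-break space, which is not a control character.
        Console.WriteLine(char.IsControl((char)160));   // False
    }
}
```

If that is the reason, the two thresholds only differ for characters that should not normally appear in web content anyway.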

Kind Regards,
Damon Carr

One Comment

  1. I am just so pissed about .NET and C#. There are no words for it.

    It is 2010, and this method still doesn’t know the euro and trademark symbols.

    Just one of many things that are completely messed up.

    From .NET producing invalid Strict pages by design to thousands of other things….

