#1
Guest
Detect non-standard characters in string
Hi
I have a project that takes an MS Word doc and reformats the text into text files that are built into my app.
The only issue is that sometimes the Word text contains characters that are not printable when viewed in Notepad. I usually catch them by looking at the text in my app. Usually the problem is an extra-long hyphen (an em dash, U+2014) or a dagger (U+2020).
When I debug the string, I usually see a square block in it.
Is there some way to trap the characters that will not be printable/viewable in, say, Notepad?
Thanks

#2
Guest
RE: Detect non-standard characters in string
You could probably use Char.IsPunctuation() in this case; Char.IsSymbol() returns false for the em dash and the dagger, since both are classified as Unicode punctuation.
--
Browse http://connect.microsoft.com/VisualStudio/feedback/ and vote.
http://www.peterRitchie.com/blog/
Microsoft MVP, Visual Developer - Visual C#
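To see which of the Char.Is* tests actually applies to the two characters from the original question, a quick check along these lines may help. This is only a sketch: CharCategoryDemo and Describe are made-up names for illustration, while Char.GetUnicodeCategory, Char.IsSymbol, and Char.IsPunctuation are the standard .NET methods.

```csharp
using System;
using System.Globalization;

public static class CharCategoryDemo
{
    // Hypothetical helper: reports how .NET classifies a single character.
    public static string Describe(char c)
    {
        UnicodeCategory category = Char.GetUnicodeCategory(c);
        return String.Format(
            "U+{0:X4} category={1} IsSymbol={2} IsPunctuation={3}",
            (int)c, category, Char.IsSymbol(c), Char.IsPunctuation(c));
    }

    public static void Main()
    {
        Console.WriteLine(Describe('\u2014')); // em dash
        Console.WriteLine(Describe('\u2020')); // dagger
        Console.WriteLine(Describe('+'));      // plus sign, for comparison
    }
}
```

Both the em dash and the dagger belong to Unicode punctuation categories, so Char.IsPunctuation is the test that reports true for them, whereas the plus sign is a math symbol and is picked up by Char.IsSymbol. Neither test says anything about whether a given font can display the character.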

#3
Guest
Re: Detect non-standard characters in string
I would just check each character's numeric value to see if the character is outside the range of ASCII characters.
Most likely, what is happening is that the text is being placed on the clipboard as Unicode, but then when you try to paste it into Notepad (which is using ASCII), it does its best by using the square character to indicate that it couldn't perform a conversion.
--
- Nicholas Paldino [.NET/C# MVP]
- mvp@spam.guard.caspershouse.com
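A minimal sketch of that range check, assuming "ASCII" means code values 0 through 127. NonAsciiScanner and FindNonAscii are hypothetical names chosen for this example, and the sample string is invented.

```csharp
using System;
using System.Collections.Generic;

public static class NonAsciiScanner
{
    // Hypothetical helper: returns the zero-based index of every char
    // whose numeric value lies above the ASCII range (greater than 127).
    public static List<int> FindNonAscii(string text)
    {
        List<int> positions = new List<int>();
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] > 127)
            {
                positions.Add(i);
            }
        }
        return positions;
    }

    public static void Main()
    {
        // Invented sample containing an em dash and a dagger.
        string sample = "pages 3\u20145 \u2020 see note";
        foreach (int i in FindNonAscii(sample))
        {
            Console.WriteLine("index {0}: U+{1:X4}", i, (int)sample[i]);
        }
    }
}
```

This flags every non-ASCII char, including legitimate accented letters, so it is stricter than "not displayable in Notepad"; whether that is acceptable depends on what the text files are allowed to contain.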

#4
Guest
Re: Detect non-standard characters in string
I don't know how the OP has configured Notepad or Word, but Notepad supports Unicode. The "square character" could be the glyph that is displayed for a Unicode character not supported by the current font.
Char.IsPunctuation should still catch it, at least in the case of dagger and em dash (both are Unicode punctuation, so Char.IsSymbol returns false for them). I don't know how well most fonts cover such characters, but what counts as "printable/viewable" does depend on the font.
--
Browse http://connect.microsoft.com/VisualStudio/feedback/ and vote.
http://www.peterRitchie.com/blog/
Microsoft MVP, Visual Developer - Visual C#

#5
Guest
Re: Detect non-standard characters in string
You need to use an Encoding object obtained via the static Encoding.GetEncoding method. This method allows you to specify the EncoderFallback to use (this defaults to the EncoderReplacementFallback, which simply replaces un-encodable chars with ?).
If you supply an EncoderExceptionFallback object instead, then when you use the Encoding to convert your content, any character the encoding cannot represent will cause an EncoderFallbackException to be thrown. The EncoderFallbackException has properties that you can use to discover which character caused the problem and where it is.
--
Anthony Jones - MVP ASP/ASP.NET
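The approach described above could be sketched roughly as follows. StrictEncodingCheck and FindFirstUnencodable are hypothetical names, and "us-ascii" is only an assumed target encoding; substitute whatever encoding the text files are actually written in.

```csharp
using System;
using System.Text;

public static class StrictEncodingCheck
{
    // Hypothetical helper: returns null if every char can be encoded,
    // otherwise a message naming the first offending char and its index.
    public static string FindFirstUnencodable(string text, string encodingName)
    {
        // Ask for an encoding that throws instead of substituting '?'.
        Encoding strict = Encoding.GetEncoding(
            encodingName,
            new EncoderExceptionFallback(),
            new DecoderExceptionFallback());
        try
        {
            strict.GetBytes(text);
            return null;
        }
        catch (EncoderFallbackException ex)
        {
            // CharUnknown and Index identify the character and its position.
            return String.Format(
                "U+{0:X4} at index {1}", (int)ex.CharUnknown, ex.Index);
        }
    }

    public static void Main()
    {
        // "us-ascii" is an assumption made for this example.
        Console.WriteLine(FindFirstUnencodable("plain text", "us-ascii") ?? "ok");
        Console.WriteLine(FindFirstUnencodable("a \u2020 b", "us-ascii"));
    }
}
```

The exception stops at the first failure, so reporting every problem character would mean re-running the check on the remainder of the string. For characters outside the Basic Multilingual Plane, the exception exposes CharUnknownHigh and CharUnknownLow rather than CharUnknown.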

#6
Guest
Re: Detect non-standard characters in string
I agree with Anthony here.
Some more references:
#I'm not a Klingon : Best Fit in WideCharToMultiByte and System.Text.Encoding Should be Avoided
http://blogs.msdn.com/shawnste/archi...19/515047.aspx
#Fallback Encoding Application Sample
http://msdn2.microsoft.com/en-us/lib...00(VS.80).aspx
Hope this helps.
Regards,
Walter Wang (wawang@online.microsoft.com, remove 'online.')
Microsoft Online Community Support