1 |
Although Visual Basic 6.0 stores strings internally as Unicode (UTF-16), it has several limitations; most notably, its intrinsic controls, the Properties window of the IDE, the Clipboard, and the PropertyBag are ANSI only. *
The purpose of this tutorial is to resolve these issues and provide working VB code solutions. The difficulty of these solutions varies, but in general they require intimate knowledge of ActiveX controls and classes. Subclassing and API programming are a must to gain functionality that VB does not directly support.
The amount of information gathered during the development of Unicode-aware controls was so overwhelming that it made sense to organize it, and this tutorial proved to be an ideal place to bring everything together under one roof.
| Tutorial Development Tools |
| Microsoft® FrontPage® 2003 |
| Microsoft® Platform SDK Feb 2007 Edition for Vista |
| Visual Basic 6.0 - |
Note: Dates in this tutorial are displayed as dd-mmm-yyyy (Example: 11-Mar-2004) to eliminate any ambiguities.
* These issues are resolved in VB.NET, although you will have to go through a learning curve to get up to speed with the language.
Review these tables to determine the minimum system requirements:
|
2 |
This flowchart shows basic program flow of:
|
3 |
| Character Set | Range | Codepage | Byte Order Mark |
| OEM (DOS) | 0..255 | 437 OEM - United States | None |
| ANSI (Windows) | 0..255 | 1252 ANSI - Latin I etc. See SBCS-DBCS | None |
| EBCDIC (Mainframe) | 0..255 | 1047 IBM EBCDIC - Latin 1/Open System | |
| UTF-8 | | 65001 | EF BB BF |
| UTF-16LE (little endian, low byte first, x86 Processor and Microsoft Windows) | 0..65535(FFFF) - 2 bytes | 1200 | FF FE |
| UTF-16BE (big endian, high byte first, PowerPC Processor and Mac OS) | 0..65535(FFFF) - 2 bytes | 1201 | FE FF |
| UTF-32LE (little endian, low byte first) | 0..10FFFF - 4 bytes | 12000 | FF FE 00 00 |
| UTF-32BE (big endian, high byte first) | 0..10FFFF - 4 bytes | 12001 | 00 00 FE FF |
| DBCS - Double-Byte Character Set | 0..65535(FFFF) - 2 bytes. Chars 0-127 are 1 byte | See DBCS | |
Note: You should have a utility on your system called CharMap.Exe, which allows you to browse and select Unicode characters.
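The byte order marks listed in the table can be checked in code before deciding how to decode a file. The function below is a minimal sketch, not part of the tutorial's own source: the name DetectBOM and the returned strings are hypothetical, and it assumes the caller passes a dimensioned byte array (for example the first few bytes of a file read with Open ... For Binary). UTF-32LE is tested before UTF-16LE because its mark starts with the same two bytes.

```vb
Option Explicit

' Returns the encoding implied by the leading bytes of b(), or "" when none
' of the byte order marks from the table is present.
' Assumption: b() has been dimensioned by the caller.
Public Function DetectBOM(b() As Byte) As String
    Dim lo As Long
    Dim n As Long

    lo = LBound(b)
    n = UBound(b) - lo + 1

    ' 4-byte marks first: FF FE 00 00 would otherwise look like UTF-16LE.
    If n >= 4 Then
        If b(lo) = &HFF And b(lo + 1) = &HFE And b(lo + 2) = 0 And b(lo + 3) = 0 Then
            DetectBOM = "UTF-32LE"
            Exit Function
        End If
        If b(lo) = 0 And b(lo + 1) = 0 And b(lo + 2) = &HFE And b(lo + 3) = &HFF Then
            DetectBOM = "UTF-32BE"
            Exit Function
        End If
    End If

    ' 3-byte mark: EF BB BF
    If n >= 3 Then
        If b(lo) = &HEF And b(lo + 1) = &HBB And b(lo + 2) = &HBF Then
            DetectBOM = "UTF-8"
            Exit Function
        End If
    End If

    ' 2-byte marks: FF FE and FE FF
    If n >= 2 Then
        If b(lo) = &HFF And b(lo + 1) = &HFE Then
            DetectBOM = "UTF-16LE"
            Exit Function
        End If
        If b(lo) = &HFE And b(lo + 1) = &HFF Then
            DetectBOM = "UTF-16BE"
            Exit Function
        End If
    End If

    DetectBOM = ""
End Function
```

A file without a mark returns an empty string; in that case the encoding has to be guessed or supplied by the user, since OEM and ANSI text carry no mark at all.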
|
4 |
SBCS(Single-Byte) and DBCS(Double-Byte) Character Sets are different character sets from Unicode.
Character codes for "A" in ANSI, Unicode, and DBCS
| ANSI Character "A" | &H41 | A |
| Unicode character "A" | &H41 &H00 | A |
| DBCS character that represents a Japanese wide-width "A" | &H82 &H60 | A |
| Unicode wide-width "A" | &H21 &HFF | A |
CharSet from Platform SDK
wingdi.h

| Font Character Sets |
| Number | Name | Info |
| 0 | ANSI_CHARSET | West, Occidental(United States, Western Europe) |
| 1 | DEFAULT_CHARSET | |
| 2 | SYMBOL_CHARSET | Standard symbol charset |
| 77 | MAC_CHARSET | Macintosh |
| 128 * | SHIFTJIS_CHARSET | Shift-JIS (Japanese Industry Standard) |
| 129 * | HANGEUL_CHARSET | Korea (Wansung) |
| 129 * | HANGUL_CHARSET | Korea (Wansung) |
| 130 * | JOHAB_CHARSET | Korea (Johab) |
| 134 * | GB2312_CHARSET | Simplified Chinese - Mainland China(PRC) and Singapore |
| 136 * | CHINESEBIG5_CHARSET | Traditional Chinese - Taiwan and Hong Kong |
| 161 | GREEK_CHARSET | Greek |
| 162 | TURKISH_CHARSET | Turkish |
| 163 | VIETNAMESE_CHARSET | Vietnamese |
| 177 | HEBREW_CHARSET | Hebrew |
| 178 | ARABIC_CHARSET | Arabic |
| 186 | BALTIC_CHARSET | Baltic |
| 204 | RUSSIAN_CHARSET | Cyrillic - Russia, Belarus, Ukraine and some other slavic countries. |
| 222 | THAI_CHARSET | Thai |
| 238 | EASTEUROPE_CHARSET | |
| 255 | OEM_CHARSET | |
* DBCS - Double-Byte Character Set
DBCS is not really the correct term for what Windows uses. It is actually MBCS (Multi-Byte Character Set), in which a character can be 1 or 2 bytes. To illustrate this, the following code takes a Unicode string of English and Chinese characters, converts it to a byte array of MBCS Chinese, dumps the byte array to the Immediate window, and finally converts it back to a Unicode string for display in a Unicode-aware textbox. When converted with the Chinese (PRC) LCID 2052, the byte array contains single bytes for the English characters and double bytes for the Chinese characters, which shows that the encoding is MBCS and not strictly DBCS:

    Option Explicit

    Private Sub Form_Load()
        Dim sUni As String
        Dim sMBCS As String
        Dim b() As Byte
        Dim i As Long

        ' Mixed English/Chinese test string
        sUni = "2006" & ChrW$(&H6B22) & "9" & ChrW$(&H8FCE) & "12" & ChrW$(&H6B22) & " 8:04"
        UniTextBoxEx1 = sUni

        ' Unicode -> MBCS using the Chinese (PRC) locale, LCID 2052
        b = StrConv(sUni, vbFromUnicode, 2052)
        sMBCS = StrConv(sUni, vbFromUnicode, 2052)
        Debug.Print sUni, sMBCS
        Text1 = sMBCS

        ' Dump every byte of the MBCS data
        For i = 0 To UBound(b)
            Debug.Print b(i)
        Next

        ' MBCS -> Unicode
        sUni = StrConv(b, vbUnicode, 2052)
        UniTextBoxEx2 = sUni
    End Sub

UniTextBoxEx1, UniTextBoxEx2:

    2006欢9迎12欢 8:04

Debug Window:

    50  ' 2
    48  ' 0
    48  ' 0
    54  ' 6
    187 '
    182 ' 欢
    57  ' 9
    211 '
    173 ' 迎
    49  ' 1
    50  ' 2
    187 '
    182 ' 欢
    32  ' space
    56  ' 8
    58  ' :
    48  ' 0
    52  ' 4
|
The following demos show how to display Chinese on an English (U.S.) system without changing your Regional settings.
Example of Vb TextBox using DBCS Charset 134, CHINESE_GB2312
Set your TextBox to any font that supports the CHINESE_GB2312 script (TextBox1.Font.Charset = 134), for example "Arial Unicode MS". Make sure you select the CHINESE_GB2312 script in the Font dialog box.
Add this DBCS string:
TextBox1 = "CHS: ¡¡ÄãÕæµÄÄܹ»ÔÚÎı¾¿òÖÐʹÓñ³¾°Í¼Æ¬£¬"

Warning: If you apply an XP Theme to the TextBox below via an IDE Manifest it will display ANSI instead of DBCS.
Example of Vb ListView using DBCS Charset 134, CHINESE_GB2312
Set your ListView to any font that supports the CHINESE_GB2312 script (Listview1.Font.Charset = 134), for example "Arial Unicode MS". Make sure you select the CHINESE_GB2312 script in the Font dialog box.
In report mode add this DBCS string:
ListItems.Add , , "CHS: ¡¡ÄãÕæµÄÄܹ»ÔÚÎı¾¿òÖÐʹÓñ³¾°Í¼Æ¬£¬"

See MSDN for more information about DBCS:
Issues Specific to the Double-Byte Character Set (DBCS)
ANSI, DBCS, and Unicode: Definitions
Calling Windows API Functions
DBCS Sort Order and String Comparison
DBCS String Manipulation Functions
DBCS-Enabled KeyPress Event
Designing an International-Aware User Interface
Font, Display and Print Considerations in a DBCS Environment
Identifiers in a DBCS Environment
Processing Files That Use Double-Byte Characters
The global options of the StrConv function are converting uppercase to lowercase, and vice versa. In addition to those options, the function has several DBCS-specific options. For example, you can convert narrow letters to wide letters by specifying vbWide in the second argument of this function. You can convert one character type to another, such as hiragana to katakana in Japanese. StrConv enables you to specify a LocaleID for the string, if different than the system's LocaleID.
You can also use the StrConv function to convert Unicode characters to ANSI/DBCS characters, and vice versa. Usually, a string in Visual Basic consists of Unicode characters. When you need to handle strings in ANSI/DBCS (for example, to calculate the number of bytes in a string before writing the string into a file), you can use this functionality of the StrConv function.
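The byte-count use of StrConv mentioned above can be sketched as follows. The function name AnsiByteCount is hypothetical; the only calls used are StrConv and LenB. When a locale ID is passed, the result depends on that locale's code page being installed on the system, so treat the DBCS case as an assumption to verify on your own machine.

```vb
Option Explicit

' Returns the number of bytes s occupies after conversion from Unicode to
' the ANSI/DBCS code page. lcid = 0 means "use the system locale".
Public Function AnsiByteCount(ByVal s As String, Optional ByVal lcid As Long = 0) As Long
    If lcid = 0 Then
        AnsiByteCount = LenB(StrConv(s, vbFromUnicode))
    Else
        AnsiByteCount = LenB(StrConv(s, vbFromUnicode, lcid))
    End If
End Function
```

For plain English text the byte count equals Len(s). For a string that mixes English and Chinese, such as the one in the MBCS demo earlier in this section, passing LCID 2052 should give a count larger than Len(s), because each Chinese character takes two bytes.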
|
5 |
The easiest way to add Unicode test strings to your project is to make a resource file with a Unicode-aware editor and compile it with RC.exe. That way you can test your controls without having to cut/paste the strings when you need them.
Use Notepad if you are on NT or later, WordPad or UltraEdit if using Win9x.
Download the complete resource file with source here.
"Welcome" in several languages
| Resource ID | "Welcome" UTF-16 Unicode | Resource ID | "Welcome" UTF-8 Encoded |
| 101 | "ARA: مـرحبــاً" | 151 | "ARA: Ù…Ù€Ø±ØØ¨Ù€Ù€Ø§Ù‹" |
| 102 | "CHS: 欢迎" | 152 | "CHS: 欢迎" |
| 103 | "CHT: 歡迎" | 153 | "CHT: æ¡è¿Ž" |
| 104 | "ENG: Welcome" | 154 | "ENG: Welcome" |
| 105 | "GEO: სასურველი" | 155 | "GEO: სáƒáƒ¡áƒ£áƒ ველი" |
| 106 | "GRK: Καλώς ήλθατε" | 156 | "GRK: Καλώς ήλθατε" |
| 107 | "HEB: בִּרוּבִים חַבָּאִים" | 157 | "HEB: ברוכי×? הב×?×™×?" |
| 108 | "HIN: रवागत" | 158 | "HIN: रवागत" |
| 109 | "JPN: ようこそ" | 159 | "JPN: よã?†ã?“ã??" |
| 110 | "KOR: 여보세요" | 160 | "KOR: 여보세요" |
| 111 | "PAN: ਜੀ ਆਇਆਂ ਨੂੰ" | 161 | "PAN: ਜੀ ਆਇਆਂ ਨੂੰ" |
| 112 | "PTB: Bem-vindo" | 162 | "PTB: Bem-vindo" |
| 113 | "RUS: Добро пожаловать" | 163 | "RUS: Добро пожаловать" |
| 114 | "TAM: அங்கிகரி" | 164 | "TAM: à®…à®™à¯?கிகரி" |
| 115 | "THA: การต้อนรับ" | 165 | "THA: à¸à¸²à¸£à¸•้à¸à¸™à¸£à¸±à¸š" |
| 116 | "URD: स्वागत" | 166 | "URD: सà¥à¤µà¤¾à¤—त" |
| 117 | "VIE: tính từ" | 167 | "VIE: tÃnh từ" |
"Hello" in several languages
* Needs Code2000 Font to see this
| Language | "Hello" UTF-16 Unicode | "Hello" UTF-8 Encoded |
| Arabic | السلام عليكم | السلام عليكم |
| Bengali (বাঙ্লা) | ষাগতোম | ষাগতোম |
| * Burmese | မ္ရန္မာ | (မ္ရန္မာ) |
| Cantonese (粵語,廣東話) | 早晨, 你好 | 早晨, ä½ å¥½ |
| * Cherokee (á£áŽ³áŽ©)ᎣᏏᏲ | ᎣᏏᏲ | Ꭳáá² |
| Chinese (中文,普通话,汉语) | 你好 | ä½ å¥½ |
| Czech (česky) | Dobrý den | Dobrý den |
| Danish (Dansk) | Hej, Goddag | Hej, Goddag |
| English | Hello | Hello |
| Esperanto | Saluton | Saluton |
| Estonian | Tere, Tervist | Tere, Tervist |
| Finnish (Suomi) | Hei | Hei |
| French (Français) | Bonjour, Salut | Bonjour, Salut |
| German (Deutsch Nord) | Guten Tag | Guten Tag |
| German (Deutsch Süd) | Grüß Gott | Grüß Gott |
| Georgian (ქართველი) | გამარჯობა | გáƒáƒ›áƒáƒ ჯáƒáƒ‘რ|
| Gujarati | (ગુજરાતિ) | (ગà«àªœàª°àª¾àª¤àª¿) |
| Greek (Ελληνικά) | Γειά σας | Γειά σας |
| Hebrew | שלום | ×©×œ×•× |
| Hindi | नमस्ते, नमस्कार। | नमसà¥à¤¤à¥‡, नमसà¥à¤•ार। |
| Italiano | Ciao, Buon giorno | Ciao, Buon giorno |
| Japanese (日本語) | こんにちは, コンニチハ | ã“ã‚“ã«ã¡ã¯, コï¾ï¾†ï¾ï¾Š |
| Korean (한글) | 안녕하세요, 안녕하십니까 | 안녕하세요, 안녕하ì‹ë‹ˆê¹Œ |
| Maltese | Ċaw, Saħħa | ÄŠaw, Saħħa |
| Nederlands (Vlaams) | Hallo, Dag Hallo, Dag | Hallo, Dag Hallo, Dag |
| Norwegian (Norsk) | Hei, God dag | Hei, God dag |
| Punjabi | (ਪੁਂਜਾਬਿ) | (ਪà©à¨‚ਜਾਬਿ) |
| Polish | Dzień dobry, Hej | DzieÅ„ dobry, Hej |
| Russian (Русский) | Здравствуйте! | ЗдравÑтвуйте! |
| Slovak | Dobrý deň | Dobrý deň |
| Spanish (Español) | ¡Hola! | ‎¡Hola!‎ |
| Swedish (Svenska) | Hej, Goddag | Hej, Goddag |
| Thai (ภาษาไทย) | สวัสดีครับ, สวัสดีค่ะ | สวัสดีครับ, สวัสดีค่ะ |
| Tamil (தமிழ்) | வணக்கம் | வணகà¯à®•ம௠|
| Turkish (Türkçe) | Merhaba | Merhaba |
| Vietnamese (Tiếng Việt) | Xin Chào | Xin Chà o |
| Yiddish â€(ײַדישע) | דאָס הײַזעלע | ד×ָס הײַזעלע |
Other methods for creating a string at Vb Runtime:
| Sample Output | Vb String |
| ARA: مـرحب | "ARA: " & ChrW$(&H645) & ChrW$(&H640) & ChrW$(&H631) & ChrW$(&H62D) & ChrW$(&H628) |
| ARM: ԱԲԳԴԵԶԷԸԹ | "ARM: " & ChrW$(&H531) & ChrW$(&H532) & ChrW$(&H533) & ChrW$(&H534) & ChrW$(&H535) & ChrW$(&H536) & ChrW$(&H537) & ChrW$(&H538) & ChrW$(&H539) |
| CHS: 欢迎 | "CHS: " & ChrW$(&H6B22) & ChrW$(&H8FCE) |
| CHT: 歡迎 | "CHT: " & ChrW$(&H6B61) & ChrW$(&H8FCE) |
| ENG: Welcome | "ENG: Welcome" |
| GEO: სასურველი | "GEO: " & ChrW$(&H10E1) & ChrW$(&H10D0) & ChrW$(&H10E1) & ChrW$(&H10E3) & ChrW$(&H10E0) & ChrW$(&H10D5) & ChrW$(&H10D4) & ChrW$(&H10DA) & ChrW$(&H10D8) |
| GRK: Καλώς ήλθατε | "GRK: " & ChrW$(&H39A) & ChrW$(&H3B1) & ChrW$(&H3BB) & ChrW$(&H3CE) & ChrW$(&H3C2) & " " & ChrW$(&H3AE) & ChrW$(&H3BB) & ChrW$(&H3B8) & ChrW$(&H3B1) & ChrW$(&H3C4) & ChrW$(&H3B5) |
| HEB: ברוכים הבאים | "HEB: " & ChrW$(&H5D1) & ChrW$(&H5E8) & ChrW$(&H5D5) & ChrW$(&H5DB) & ChrW$(&H5D9) & ChrW$(&H5DD) & " " & ChrW$(&H5D4) & ChrW$(&H5D1) & ChrW$(&H5D0) & ChrW$(&H5D9) & ChrW$(&H5DD) |
| HIN: रवागत | "HIN: " & ChrW$(&H930) & ChrW$(&H935) & ChrW$(&H93E) & ChrW$(&H917) & ChrW$(&H924) |
| JPN: ようこそ | "JPN: " & ChrW$(&H3088) & ChrW$(&H3046) & ChrW$(&H3053) & ChrW$(&H305D) |
| KOR: 여보세요 | "KOR: " & ChrW$(&HC5EC) & ChrW$(&HBCF4) & ChrW$(&HC138) & ChrW$(&HC694) |
| PAN: ਜੀ ਆਇਆਂ ਨੂੰ | "PAN: " & ChrW$(&HA1C) & ChrW$(&HA40) & " " & ChrW$(&HA06) & ChrW$(&HA07) & ChrW$(&HA06) & ChrW$(&HA02) & " " & ChrW$(&HA28) & ChrW$(&HA42) & ChrW$(&HA70) |
| PTB: Bem-vindo | "PTB: Bem-vindo" |
| RUS: Добро пожаловать | "RUS: " & ChrW$(&H414) & ChrW$(&H43E) & ChrW$(&H431) & ChrW$(&H440) & ChrW$(&H43E) & " " & ChrW$(&H43F) & ChrW$(&H43E) & ChrW$(&H436) & ChrW$(&H430) & ChrW$(&H43B) & ChrW$(&H43E) & ChrW$(&H432) & ChrW$(&H430) & ChrW$(&H442) & ChrW$(&H44C) |
| TAM: அங்கிகரி | "TAM: " & ChrW$(&HB85) & ChrW$(&HB99) & ChrW$(&HBCD) & ChrW$(&HB95) & ChrW$(&HBBF) & ChrW$(&HB95) & ChrW$(&HBB0) & ChrW$(&HBBF) |
| THA: การต้อนรับ | "THA: " & ChrW$(&HE01) & ChrW$(&HE32) & ChrW$(&HE23) & ChrW$(&HE15) & ChrW$(&HE49) & ChrW$(&HE2D) & ChrW$(&HE19) & ChrW$(&HE23) & ChrW$(&HE31) & ChrW$(&HE1A) |
| URD: स्वागत | "URD: " & ChrW$(&H938) & ChrW$(&H94D) & ChrW$(&H935) & ChrW$(&H93E) & ChrW$(&H917) & ChrW$(&H924) |
| VIE: tính từ | "VIE: tính t" & ChrW$(&H1EEB) |
Note: Under the hood StrConv inserts a BOM (FEFF) before the CJK Unified Ideographs.
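Typing long ChrW$ chains like the ones in the table by hand is error-prone. The helper below is a minimal sketch that generates such an expression from a string you already have at runtime (for example one loaded from a resource file). The names ToChrWExpression and AppendPart are hypothetical and not part of the tutorial's source. Printable ASCII is kept as quoted text; everything else, including the double quote, is emitted as ChrW$(&Hxxxx).

```vb
Option Explicit

' Builds VB source text such as  "CHS: " & ChrW$(&H6B22) & ChrW$(&H8FCE)
' for the string s.
Public Function ToChrWExpression(ByVal s As String) As String
    Dim i As Long
    Dim code As Long
    Dim sRun As String      ' pending run of printable ASCII characters
    Dim sOut As String

    For i = 1 To Len(s)
        ' AscW returns a signed Integer; mask it to 0..65535.
        code = AscW(Mid$(s, i, 1)) And &HFFFF&
        If code >= 32 And code <= 126 And code <> 34 Then
            sRun = sRun & ChrW$(code)
        Else
            If Len(sRun) > 0 Then
                AppendPart sOut, """" & sRun & """"
                sRun = ""
            End If
            AppendPart sOut, "ChrW$(&H" & Hex$(code) & ")"
        End If
    Next

    ' Flush a trailing run of ASCII text.
    If Len(sRun) > 0 Then
        AppendPart sOut, """" & sRun & """"
    End If

    ToChrWExpression = sOut
End Function

' Appends sPart to sOut, inserting " & " between parts.
Private Sub AppendPart(sOut As String, ByVal sPart As String)
    If Len(sOut) > 0 Then sOut = sOut & " & "
    sOut = sOut & sPart
End Sub
```

Print the result with Debug.Print and paste it into your source. Characters outside the Basic Multilingual Plane are stored by VB as two UTF-16 code units and come out as two ChrW$ calls.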
More stuff to play with:
| Nº | Sample | Font.Name | Font.Charset | String |
| 1 | English | Tahoma | ANSI_CHARSET | "English" |
| 2 | româneşte | " | EASTEUROPE_CHARSET | ChrW$(114) & ChrW$(111) & ChrW$(109) & ChrW$(226) & ChrW$(110) & ChrW$(101) & ChrW$(351) & ChrW$(116) & ChrW$(101) |
| 3 | ภาษาไทย | " | THAI_CHARSET | ChrW$(3616) & ChrW$(3634) & ChrW$(3625) & ChrW$(3634) & ChrW$(3652) & ChrW$(3607) & ChrW$(3618) |
| 4 | Հայերեն | Arial Unicode MS | | ChrW$(1344) & ChrW$(1377) & ChrW$(1397) & ChrW$(1381) & ChrW$(1408) & ChrW$(1381) & ChrW$(1398) |
| 5 | Tiếng Việt | " | VIETNAMESE_CHARSET | ChrW$(84) & ChrW$(105) & ChrW$(234) & ChrW$(769) & ChrW$(110) & ChrW$(103) & ChrW$(32) & ChrW$(86) & ChrW$(105) & ChrW$(234) & ChrW$(803) & ChrW$(116) |
| 6 | עברית | " | HEBREW_CHARSET | ChrW$(1506) & ChrW$(1489) & ChrW$(1512) & ChrW$(1497) & ChrW$(1514) |
| 7 | मराठी | Arial Unicode MS | | ChrW$(2350) & ChrW$(2352) & ChrW$(2366) & ChrW$(2336) & ChrW$(2368) |
| 8 | 中文 (台灣) | PMingLiU | CHINESEBIG5_CHARSET | ChrW$(20013) & ChrW$(25991) & " (" & ChrW$(21488) & ChrW$(28771) & ")" |
| 9 | नेपाली | Arial Unicode MS | | ChrW$(2344) & ChrW$(2375) & ChrW$(2346) & ChrW$(2366) & ChrW$(2354) & ChrW$(2368) |
| 10 | Русский | " | RUSSIAN_CHARSET | ChrW$(1056) & ChrW$(1091) & ChrW$(1089) & ChrW$(1089) & ChrW$(1082) & ChrW$(1080) & ChrW$(1081) |
| 11 | ირუკსაბ | Arial Unicode MS | | StrReverse(ChrW$(4305) & ChrW$(4304) & ChrW$(4321) & ChrW$(4313) & ChrW$(4323) & ChrW$(4320) & ChrW$(4312)) |
| 12 | 日本語 | Arial Unicode MS | SHIFTJIS_CHARSET | ChrW$(26085) & ChrW$(26412) & ChrW$(-30050) |
| 13 | ଉଡିଯା | Arial Unicode MS | | ChrW$(2825) & ChrW$(2849) & ChrW$(2879) & ChrW$(2863) & ChrW$(2878) |
| 14 | Ελληνικά | " | GREEK_CHARSET | ChrW$(917) & ChrW$(955) & ChrW$(955) & ChrW$(951) & ChrW$(957) & ChrW$(953) & ChrW$(954) & ChrW$(940) |
| 15 | हिन्दी | Arial Unicode MS | | ChrW$(2361) & ChrW$(2367) & ChrW$(2344) & ChrW$(2381) & ChrW$(2342) & ChrW$(2368) |
| 16 | 한국어 | GulimChe | HANGEUL_CHARSET | ChrW$(-10916) & ChrW$(-21139) & ChrW$(-14924) |
| 17 | తెలుగు | Arial Unicode MS | | ChrW$(3108) & ChrW$(3142) & ChrW$(3122) & ChrW$(3137) & ChrW$(3095) & ChrW$(3137) |
| 18 | Čeština | " | EASTEUROPE_CHARSET | ChrW$(268) & ChrW$(101) & ChrW$(353) & ChrW$(116) & ChrW$(105) & ChrW$(110) & ChrW$(97) |
| 19 | ಕನ್ನಡ | Arial Unicode MS | | ChrW$(3221) & ChrW$(3240) & ChrW$(3277) & ChrW$(3240) & ChrW$(3233) |
| 20 | 中文(中国) | SimSun | GB2312_CHARSET | ChrW$(20013) & ChrW$(25991) & "(" & ChrW$(20013) & ChrW$(22269) & ")" |
| 21 | ગુજરાતી | Arial Unicode MS | | ChrW$(2711) & ChrW$(2753) & ChrW$(2716) & ChrW$(2736) & ChrW$(2750) & ChrW$(2724) & ChrW$(2752) |
| 22 | Türkçe | " | TURKISH_CHARSET | ChrW$(84) & ChrW$(252) & ChrW$(114) & ChrW$(107) & ChrW$(231) & ChrW$(101) |
| 23 | தமிழ் | Arial Unicode MS | | ChrW$(2980) & ChrW$(2990) & ChrW$(3007) & ChrW$(2996) & ChrW$(3021) |
|
6 |
| OS/ | Unicode | API | Fonts | Additional Requirements |
| Vb5/6 | Yes. Uses Unicode to store and manipulate strings. | Intrinsic controls, Properties Window (IDE), Clipboard, and PropertyBag are ANSI only. | | |
| NT/2000/XP/Vista | Yes. Uses Unicode to store and manipulate strings. | Uses Unicode: DrawTextW Lib "user32" - TextOutW Lib "gdi32" | Installed. You may need to enable Far East language support via Control Panel, Regional Options, Languages if that was not done at install time. | None |
| 98/ME | No. Uses ANSI or * DBCS to store and manipulate strings. | Uses ANSI: DrawTextA Lib "user32" - TextOutA Lib "gdi32" or DrawTextW Lib "Unicows" - TextOutW Lib "Unicows" | You need to install at least one Unicode font. Arial Unicode MS used to be a free (23Mb) download from Microsoft. It is installed automatically with Office XP Pro or Frontpage 2002. | Microsoft Layer for Unicode on Win9x Systems (MSLU). Unicows.DLL (269.7kb download) available free from Microsoft. The current FileVersion is "1.0.4018.0" April 21, 2003. |
| Automation 95/98/ME & NT/2000/XP/Vista | Yes. Uses Unicode to pass the strings back and forth. | | | |
XP now supports a total of 136 locales, which includes the 126 locales supported by Windows 2000 and adds the following:
Other international features New to XP:
Windows XP Service Pack 2 introduces 25 additional locales and another 11 with Service Pack 2 Update:
| Windows XP Service Pack 2 Locales |
| Bengali (India) | Quechua (Bolivia) | Sami, Northern (Sweden) |
| Bosnian (Latin, Bosnia and Herzegovina) | Quechua (Ecuador) | Sami, Skolt (Finland) |
| Croatian (Latin, Bosnia and Herzegovina) | Quechua (Peru) | Sami, Southern (Norway) |
| isiXhosa (South Africa) | Sami, Inari (Finland) | Sami, Southern (Sweden) |
| isiZulu (South Africa) | Sami, Lule (Norway) | Serbian (Cyrillic, Bosnia and Herzegovina) |
| Malayalam (India) | Sami, Lule (Sweden) | Serbian (Latin, Bosnia and Herzegovina) |
| Maltese (Malta) | Sami, Northern (Finland) | Sesotho sa Leboa (South Africa) |
| Maori (New Zealand) | Sami, Northern (Norway) | Setswana (South Africa) |
| | | Welsh (United Kingdom) |
| Windows XP Service Pack 2 Update Locales |
| Bosnian (Cyrillic, Bosnia and Herzegovina) | Irish (Ireland) | Nepali (Nepal) |
| Filipino (Philippines) | Luxembourgish (Luxembourg) | Pashto (Afghanistan) |
| Frisian (Netherlands) | Mapudungun (Chile) | Romansh (Switzerland) |
| Inuktitut (Latin, Canada) | Mohawk (Mohawk) | |
| Windows Vista New locales |
| Alsatian (France) | Hausa (Latin, Nigeria) | Spanish (United States) |
| Amharic (Ethiopia) | Igbo (Nigeria) | Tajik (Cyrillic, Tajikistan) |
| Assamese (India) | Inuktitut (Syllabics, Canada) | Tamazight (Latin, Algeria) |
| Bashkir (Russia) | Khmer (Cambodia) | Tibetan (PRC) |
| Bengali (Bangladesh) | K'iche (Guatemala) | Turkmen (Turkmenistan) |
| Breton (France) | Kinyarwanda (Rwanda) | Uighur (PRC) |
| Corsican (France) | Lao (Lao P.D.R.) | Upper Sorbian (Germany) |
| Dari (Afghanistan) | Lower Sorbian (Germany) | Wolof (Senegal) |
| English (India) | Mongolian (Traditional Mongolian, PRC) | Yakut (Russia) |
| English (Malaysia) | Occitan (France) | Yi (PRC) |
| English (Singapore) | Oriya (India) | Yoruba (Nigeria) |
| Greenlandic (Greenland) | Sinhala (Sri Lanka) | |
These locales are automatically installed when you update Windows XP to SP2. You can select new locales in Regional and Language Options. They are not supported in Windows Server 2003.
|
7 |
Unicode Only LCIDs
| Identifier | Language | Platform |
|---|---|---|
| 0x042b | Armenian | 2000/XP |
| 0x0465 | Divehi | XP |
| 0x0437 | Georgian | 2000/XP |
| 0x0447 | Gujarati | XP |
| 0x0439 | Hindi | 2000/XP |
| 0x044b | Kannada | XP |
| 0x0457 | Konkani | 2000/XP |
| 0x044e | Marathi | 2000/XP |
| 0x0446 | Punjabi | XP |
| 0x044f | Sanskrit | 2000/XP |
| 0x045a | Syriac | XP |
| 0x0449 | Tamil | 2000/XP |
| 0x044a | Telugu | XP |
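A locale from this table can be recognised at runtime because Windows reports its default ANSI code page as 0. The function below is a minimal sketch under that assumption; the name IsUnicodeOnlyLCID is hypothetical. It uses the GetLocaleInfoA API with LOCALE_IDEFAULTANSICODEPAGE, and returns False when the call fails, for example when the locale is not supported on the running platform.

```vb
Option Explicit

Private Const LOCALE_IDEFAULTANSICODEPAGE As Long = &H1004

Private Declare Function GetLocaleInfo Lib "kernel32" Alias "GetLocaleInfoA" ( _
    ByVal Locale As Long, ByVal LCType As Long, _
    ByVal lpLCData As String, ByVal cchData As Long) As Long

' True when the locale has no ANSI code page (default ANSI code page "0").
Public Function IsUnicodeOnlyLCID(ByVal lcid As Long) As Boolean
    Dim sBuf As String
    Dim n As Long

    sBuf = String$(16, vbNullChar)
    ' n is the number of characters written, including the terminating null.
    n = GetLocaleInfo(lcid, LOCALE_IDEFAULTANSICODEPAGE, sBuf, Len(sBuf))
    If n > 1 Then
        IsUnicodeOnlyLCID = (Val(Left$(sBuf, n - 1)) = 0)
    End If
End Function
```

For such a locale StrConv with vbFromUnicode cannot produce meaningful ANSI text, so strings in these languages have to stay in Unicode from end to end.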
|
8 |
Table of Known Code Pages
| CP_ACP | 0 | WesternEuropean_Mac | 10000 | UserDefined | 50000 |
| CP_OEMCP | 1 | Japanese_Mac | 10001 | AutoSelect | 50001 |
| CP_MACCP | 2 | Arabic_Mac | 10004 | Japanese_JIS | 50220 |
| CP_THREAD_ACP | 3 | Greek_Mac | 10006 | Japanese_JIS_Allow1byteKana | 50221 |
| CP_SYMBOL | 42 | Cyrillic_Mac | 10007 | Japanese_JIS_Allow1byteKanaSOSI | 50222 |
| OEM_UnitedStates | 437 | Latin2_Mac | 10029 | Korean_ISO | 50225 |
| Arabic_ASMO708 | 708 | Turkish_Mac | 10081 | Japanese_AutoSelect | 50932 |
| Arabic_DOS | 720 | Chinese_Traditional_CNS | 20000 | Chinese_Simplified_AutoSelect | 50936 |
| Greek_DOS | 737 | Chinese_Traditional_Eten | 20002 | Korean_AutoSelect | 50949 |
| Baltic_DOS | 775 | WesternEuropean_IA5 | 20105 | Chinese_Traditional_Auto_Select | 50950 |
| WesternEuropean_DOS | 850 | German_IA5 | 20106 | Cyrillic_Auto_Select | 51251 |
| Central_European_DOS | 852 | Swedish_IA5 | 20107 | Greek_AutoSelect | 51253 |
| Icelandic_DOS | 861 | Norwegian_IA5 | 20108 | Arabic_AutoSelect | 51256 |
| Hebrew_DOS | 862 | US_ASCII | 20127 | Japanese_EUC | 51932 |
| Cyrillic_DOS | 866 | Cyrillic_KOI8R | 20866 | Chinese_Simplified_EUC | 51936 |
| Greek_DOS_Modern | 869 | Cyrillic_KOI8U | 21866 | Korean_EUC | 51949 |
| Thai_Windows | 874 | WesternEuropean_ISO | 28591 | Chinese_Simplified_HZ | 52936 |
| IBM_EBCDIC_GreekModern | 875 | Central_European_ISO | 28592 | CP_UTF7 | 65000 * |
| Japanese_ShiftJIS | 932 * | Baltic_ISO | 28594 | CP_UTF8 | 65001 * |
| Chinese_Simplified_GB2312 | 936 * | Cyrillic_ISO | 28595 | ||
| Korean | 949 * | Arabic_ISO | 28596 | ||
| Chinese_Traditional_Big5 | 950 * | Greek_ISO | 28597 | ||
| Unicode | 1200 | Latin3_ISO | 28593 | ||
| Unicode_BigEndian | 1201 | Hebrew_ISO_Visual | 28598 | ||
| Central_European_Windows | 1250 | Turkish_ISO | 28599 | ||
| Cyrillic_Windows | 1251 | Latin9_ISO | 28605 | ||
| WesternEuropean_Windows | 1252 | Europa | 29001 | ||
| Greek_Windows | 1253 | Hebrew_ISO_Logical | 38598 | ||
| Turkish_Windows | 1254 | ||||
| Hebrew_Windows | 1255 | ||||
| Arabic_Windows | 1256 | ||||
| Baltic_Windows | 1257 | ||||
| Vietnamese_Windows | 1258 | ||||
| Korean_Johab | 1361 |
* 932, 936, 949, 950: DBCS - Double-Byte Character Set
* 65000, 65001: This is a pseudo code page. There is no corresponding NLS file. This code page ID can only be used with the WideCharToMultiByte() and MultiByteToWideChar() API calls.
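Because 65001 is only usable through the API, StrConv cannot decode UTF-8 for you. The following is a minimal sketch of a UTF-8 to VB string conversion using MultiByteToWideChar; the function name Utf8ToString is hypothetical, and the code assumes the byte array is dimensioned, contains no byte order mark, and that the platform supports CP_UTF8 in this API.

```vb
Option Explicit

Private Const CP_UTF8 As Long = 65001

Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long

' Converts UTF-8 bytes to a VB (UTF-16) string. Returns "" on failure.
Public Function Utf8ToString(b() As Byte) As String
    Dim cb As Long
    Dim cch As Long
    Dim sOut As String

    cb = UBound(b) - LBound(b) + 1
    If cb <= 0 Then Exit Function

    ' First call: ask how many UTF-16 code units are needed.
    cch = MultiByteToWideChar(CP_UTF8, 0, VarPtr(b(LBound(b))), cb, 0, 0)
    If cch = 0 Then Exit Function

    ' Second call: convert into a preallocated string buffer.
    sOut = String$(cch, vbNullChar)
    MultiByteToWideChar CP_UTF8, 0, VarPtr(b(LBound(b))), cb, StrPtr(sOut), cch

    Utf8ToString = sOut
End Function
```

The bytes E6 AC A2 E8 BF 8E, which appear as "欢迎" when viewed as ANSI text, decode to the two characters ChrW$(&H6B22) & ChrW$(&H8FCE).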
|
9 |
Fonts used on Windows XP-SP1:
| Method | API | Parameter | Font Name | API | Face Name |
| GetStockFont | GetStockObject | SYSTEM_FONT | MS Sans Serif | GetTextFace | System |
| GetStockFont | GetStockObject | DEFAULT_GUI_FONT | MS Sans Serif | GetTextFace | MS Shell Dlg |
| GetSysFontA | SystemParametersInfo | cf_Caption | Arial | ||
| GetSysFontA | SystemParametersInfo | cf_Menu | Tahoma | ||
| GetSysFontA | SystemParametersInfo | cf_Message | Tahoma | ||
| GetSysFontA | SystemParametersInfo | cf_SmallCaption | Tahoma | ||
| GetSysFontA | SystemParametersInfo | cf_Status | Tahoma |
Win2000/XP/Office2000/Office2003 should already have Arial Unicode MS installed.
Win98 users will need a Unicode font to render Unicode glyphs.
| Font (Not redistributable.) | Version | Glyphs | Size | Comments |
| Arial Unicode MS | 1.00 | 51,180 | 22.19 Mb | Installed with Win2000/Office2000 or later. |
| Lucida Sans Unicode | 2.00 | 1,776 | 316.4 kb | |
| Bitstream Cyberbit | beta v2.0 | 29,934 | 13.04 Mb | Complete - Download Font and DOC here. |
| Bitstream Cyberbase | beta v2.0 | 1,249 | 302 kb | No CJK - Download Font and DOC here. |
| Bitstream CyberCJK | beta v2.0 | 28,686 | 12.74 Mb | CJK only - Download Font and DOC here. |
| Code2000 | 1.13 | 34,810 | 3.01 Mb | Shareware. US$5.00. Download here. |
| TITUS Cyberbit Basic | 2000; 3.0 | 9,568 | 1.82 Mb | Non-Commercial use only. UNICODE 4.0 compliant. Download here. |
| Doulos SIL | 4.010.2004 | | 674 kb | DoulosSIL4.0.10.zip. Non-Commercial use only. License/DistributionRestrictions |
| Ezra SIL Hebrew Unicode | | | 3.20 Mb | EzrSIL20.zip. Non-Commercial use only. License/DistributionRestrictions |
Sample Font Support for several Languages
Test Platform WinXP-SP1
| Sample | Arial Unicode MS | Code 2000 | * Microsoft Tahoma or Arial | * Microsoft Sans Serif | Bitstream Cyberbit | TITUS Cyberbit Basic |
| ARA: مـرحبــاً | ||||||
| CHS: 欢迎 | ||||||
| CHT: 歡迎 | ||||||
| GEO: სასურველი | ||||||
| GRK: Καλώς ήλθατε | ||||||
| HEB: בִּרוּבִים חַבָּאִים | ||||||
| HIN: रवागत | ||||||
| JPN: ようこそ | ||||||
| KOR: 여보세요 | ||||||
| PAN: ਜੀ ਆਇਆਂ ਨੂੰ | ||||||
| RUS: Добро пожаловать | ||||||
| TAM: அங்கிகரி | ||||||
| THA: การต้อนรับ | ||||||
| URD: स्वागत | ||||||
| VIE: tính từ |
* Tahoma and Arial do not support all of these languages, but they work thanks to Uniscribe font fallback. Apparently MS Sans Serif does not use or support font fallback. Font fallback is only available on Win2000 or later.
|
10 |
Microsoft Layer for Unicode Technology(UNICOWS.DLL)

While you can make separate programs for specific platforms, it is often desirable to make one program that works on all platforms. By using the Microsoft Layer for Unicode (Unicows.DLL, 240kb), a single executable can run on both NT-based and Win9x platforms. In this case you can use DrawTextW or TextOutW Lib "Unicows" on all platforms.
The Unicows.Dll forwards calls to the system API if you are running on NT, 2000, or XP platforms.
MSLU does not support the display of characters that the system cannot display. Therefore do not expect to see Chinese under Win98 English even though you have installed a font that supports Chinese(Arial Unicode MS for example). For more info see Newsgroup microsoft.public.platformsdk.mslayerforunicode and http://trigeminal.com/usenet/usenet035.asp.
Use this conditional compilation directive to test your code with and without Unicows:
Under Project/Properties/Make set conditional compilation arguments to UNICOWS = -1 or UNICOWS = 0.
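The directive itself did not survive on this page, so the following is only a sketch of what such a switch could look like, assuming the UNICOWS constant described above. The Declare signatures are the usual ones for these APIs with the string passed as a pointer (StrPtr); which functions you route through Unicows is up to your project.

```vb
Option Explicit

Private Type RECT
    Left As Long
    Top As Long
    Right As Long
    Bottom As Long
End Type

' UNICOWS comes from Project / Properties / Make / Conditional Compilation
' Arguments (UNICOWS = -1 or UNICOWS = 0). An undefined constant counts as 0.
#If UNICOWS Then
    ' Route the wide calls through the Microsoft Layer for Unicode.
    Private Declare Function TextOutW Lib "unicows" ( _
        ByVal hdc As Long, ByVal x As Long, ByVal y As Long, _
        ByVal lpString As Long, ByVal nCount As Long) As Long
    Private Declare Function DrawTextW Lib "unicows" ( _
        ByVal hdc As Long, ByVal lpStr As Long, ByVal nCount As Long, _
        lpRect As RECT, ByVal wFormat As Long) As Long
#Else
    ' Call the system libraries directly.
    Private Declare Function TextOutW Lib "gdi32" ( _
        ByVal hdc As Long, ByVal x As Long, ByVal y As Long, _
        ByVal lpString As Long, ByVal nCount As Long) As Long
    Private Declare Function DrawTextW Lib "user32" ( _
        ByVal hdc As Long, ByVal lpStr As Long, ByVal nCount As Long, _
        lpRect As RECT, ByVal wFormat As Long) As Long
#End If
```

The calling code stays the same in both builds, for example TextOutW hdc, x, y, StrPtr(sText), Len(sText), so switching the constant is enough to compare behaviour with and without Unicows.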
Note:
You may not need MSLU at all if your program uses only DrawText or TextOut. In this case you can simply wrap the ANSI and Wide versions into a Sub. Do not expect to see Unicode on Win9x platforms even if you are using a Unicode font such as Arial Unicode MS:
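
    ' Assumes the usual Declare statements for GetVersion, DrawTextA/W,
    ' TextOutA/W and GetTextExtentPoint32A/W, plus the RECT and SIZEAPI types.
    ' The W declares take the string as a Long pointer, the A declares As String.

    ' Module level: are we running NT?
    Private m_bIsNt As Boolean

    ' Put this in your startup (Initialise, Sub Main, etc.)
    Dim lVer As Long
    lVer = GetVersion()
    m_bIsNt = ((lVer And &H80000000) = 0)   ' High bit is set on Win9x

    ' Body follows the same pattern as pTextOut below.
    Public Sub pDrawText(ByVal hdc As Long, ByVal s As String, tR As RECT, ByVal lFlags As Long)
        Dim lPtr As Long
        If (m_bIsNt) Then
            lPtr = StrPtr(s)
            If Not (lPtr = 0) Then
                DrawTextW hdc, lPtr, -1, tR, lFlags
            End If
        Else
            DrawTextA hdc, s, -1, tR, lFlags
        End If
    End Sub

    Public Sub pTextOut(ByVal lhDC As Long, ByVal x As Long, ByVal y As Long, ByVal sText As String)
        Dim lPtr As Long
        If (m_bIsNt) Then
            lPtr = StrPtr(sText)
            If Not (lPtr = 0) Then
                TextOutW lhDC, x, y, lPtr, Len(sText)
            End If
        Else
            ' The ANSI call expects a byte count, which differs from Len() for DBCS text
            TextOutA lhDC, x, y, sText, LenB(StrConv(sText, vbFromUnicode))
        End If
    End Sub

    Public Sub pGetTextExtentPoint32(ByVal hdc As Long, ByVal s As String, lpSize As SIZEAPI)
        Dim lPtr As Long
        If (m_bIsNt) Then
            lPtr = StrPtr(s)
            If Not (lPtr = 0) Then
                GetTextExtentPoint32W hdc, lPtr, Len(s), lpSize
            End If
        Else
            ' Byte count again, for the same reason as in pTextOut
            GetTextExtentPoint32A hdc, s, LenB(StrConv(s, vbFromUnicode)), lpSize
        End If
    End Sub
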
|
11 |
Uniscribe Architecture

In 1999 Microsoft introduced Uniscribe, a Windows system-level component that could take advantage of OpenType fonts. Microsoft Windows 2000 and applications Internet Explorer 5 and Office 2000 were released with support for Uniscribe built in.
On Windows 2000 and later, Uniscribe supports the processing of complex scripts, that is, those scripts that need special processing to be rendered properly. It includes a subset of the features found in GDI+ in Windows 2000 and Windows XP. The rules governing the shaping and positioning of glyphs are specified and catalogued in The Unicode Standard: Worldwide Character Encoding, Version 2.0, Addison-Wesley Publishing Company.
http://msdn.microsoft.com/library/en-us/mslu/winprog/other_existing_unicode_support.asp?frame=true
http://www.microsoft.com/msj/1198/multilang/multilang.aspx
A complex script has at least one of the following attributes:
You may wonder how WinXP displays Unicode correctly even when you haven't selected a font that supports all the required characters.
"Font fallback: this mechanism, made available through Uniscribe (see section on Complex Scripts Support), provides a fallback font (or a default font) when dealing with complex scripts. If the selected font face does not include any glyphs for the complex script that is about to be displayed, Uniscribe selects a default hardcoded font for the given script. For example, if you have Hindi text and the font is Courier, then Uniscribe will use the Mangal font. This technique is internal to Uniscribe and developers can not add additional fonts to the list of fallback fonts."
Note: To enable font fallback when calling ScriptStringAnalyse, include SSA_FALLBACK in the flags.
Uniscribe is installed with Internet Explorer 5.0 or later, MS Office, Win2000, WinXP. Here are some versions of Uniscribe I found (including one found on Win98SE):
| usp10.dll FileVersion (Note: Not redistributable.) | Size (bytes) | TimeDateStamp (Internal) | Comments |
| 1.0163.1890.1 | 268,288 | 22-Sep-1998 23:04:38 | Microsoft Systems Journal Nov 98. Download code here |
| 1.0325.2180.1 | 315,152 | 30-Nov-1999 09:34:40 | Found on Win98SE \Windows\System. Download from DLL World or Microsoft |
| 1.0400.2411.1 | | | Installed with Internet Explorer 6 |
| 1.0405.2415.1 (lab06_N.010104-1344) | 325,120 | 06-Jan-2001 05:14:26 | MS Office 10 common archives |
| 1.0409.2600.1106 (xpsp1.020828-1920) | 339,456 | 09-Sep-2002 21:05:43 | XP-SP1 \Windows\System32 |
| 1.0420.2600.2180 (xpsp_sp2_rtm.040803-2158) | 406,528 | 07-Dec-2005 14:38 | XP-SP2 \Windows\System32 |
| 1.0453.3665.0 (private/Lab06_dev(paulnel). 020427-0653) | 397,312 | 06-Aug-2002 23:14:38 | |
| 1.0471.4030.0 (main.030626-1414) | 413,184 | 27-Jun-2003 10:24:14 | Microsoft Office 2003 |
The best results in tests run on Win98SE have been with Uniscribe version 1.0405.2415.1. The Microsoft Office 2003 version has not been tested yet.
A Vb wrapper for this library can be found in Internationalization with Visual Basic by Michael S. Kaplan. It comes with a CD containing sample source code. The sample includes a Uniscribe-aware version of ExtTextOutW. More info here.
A more complex C++ example can be found at "Supporting Multilanguage Text Layout and Complex Scripts with Windows NT 5.0". Don't be misled by 'Windows NT 5.0' in the title, because this demo also works on Win98. More info here.
| Logical characters: | Display Plain Text and handle caret placement: | Display Formatted Text and handle caret placement: |
| ARA: العربية | العربية :ARA | |
This sample has been updated and can be found in the Microsoft® Platform SDK (August 2002 Edition, Windows XP SP1), if you have it installed, under C:\Program Files\Microsoft SDK\Samples\winui\globaldev\CSSamp. You may encounter problems compiling it due to missing or outdated files. The following files may need to be copied:
| File | Copy From | Copy To |
| Shlwapi.h 12-Jul-2002 60,270 bytes | Microsoft Visual Studio .NET 2003\Vc7\PlatformSDK\Include | Microsoft Visual Studio\VC98\Include |
| ShTypes.h 05-Aug-2002 6,622 bytes | Microsoft Visual Studio .NET 2003\Vc7\PlatformSDK\Include | Microsoft Visual Studio\VC98\Include |
| usp10.h 15-Aug-2002 81,839 bytes | Microsoft Visual Studio .NET 2003\Vc7\PlatformSDK\Include | Microsoft SDK\Samples\winui\globaldev\CSSamp |
To build an application that supports Unicode on all platforms AND uses Uniscribe, you could use something similar to this:

    Public Sub pDrawText(ByVal hdc As Long, ByVal s As String, tR As RECT, ByVal lFlags As Long)
        Dim lPtr As Long
        If (IsNt) Then
            lPtr = StrPtr(s)
            If Not (lPtr = 0) Then
                DrawTextW hdc, lPtr, -1, tR, lFlags
            End If
        Else
            If (IsUnicode(s)) Then
                If (HasUniscribe) Then
                    DrawTextU hdc, s, tR, lFlags    'Uniscribe Wrapper
                Else
                    DrawTextM hdc, s, tR, lFlags    'MultiByte Wrapper
                End If
            Else
                DrawTextA hdc, s, -1, tR, lFlags
            End If
        End If
    End Sub

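The routine above relies on two helpers, HasUniscribe and IsUnicode, whose source is not shown here. The following is one possible sketch of them, not the author's implementation. HasUniscribe simply tries to load usp10.dll. IsUnicode treats a string as "Unicode" when it does not survive a round trip through the system ANSI code page.

```vb
Option Explicit

Private Declare Function LoadLibrary Lib "kernel32" Alias "LoadLibraryA" ( _
    ByVal lpLibFileName As String) As Long
Private Declare Function FreeLibrary Lib "kernel32" ( _
    ByVal hLibModule As Long) As Long

' True when usp10.dll can be loaded from the normal DLL search path.
Public Function HasUniscribe() As Boolean
    Dim hLib As Long
    hLib = LoadLibrary("usp10.dll")
    If hLib <> 0 Then
        HasUniscribe = True
        FreeLibrary hLib
    End If
End Function

' True when s contains characters that the system ANSI code page cannot
' represent, i.e. converting to ANSI and back changes the string.
Public Function IsUnicode(ByVal s As String) As Boolean
    Dim sRoundTrip As String
    sRoundTrip = StrConv(StrConv(s, vbFromUnicode), vbUnicode)
    IsUnicode = (StrComp(sRoundTrip, s, vbBinaryCompare) <> 0)
End Function
```

In a real program you would probably cache the result of HasUniscribe at startup rather than loading the library on every draw call. The result of IsUnicode depends on the system code page: Chinese text, for example, round-trips unchanged on a Chinese system and is then drawn correctly by the ANSI call.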
|
12 |
MLang provides services for applications on international issues, including conversion between code pages, font linking, code page "guessing", line breaking, and more. It is installed with Internet Explorer 5.5 or later.
http://msdn.microsoft.com/workshop/misc/mlang/mlang.asp
http://msdn.microsoft.com/workshop/misc/mlang/reference/objects/CMultiLanguage.asp
The only Vb wrapper for this library I can find is here.
|
13 |
The Rich Edit control provides a programming interface for formatting text. It can be used in lieu of the Fm20.Dll Unicode TextBox, with no distribution issues, and the controls listed below come with source code.
| Rich Edit version | Unicode | New | DLL | XP - SP1 | XP | Me | 2000 | NT | 98 | 95 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1.0 | | Riched32.dll | Emulator | Emulator | Emulator | |||||
| 2.0 | Supports Unicode | Riched20.dll | May be installed | |||||||
| 3.0 | Expanded support for complex scripts, partly due to Uniscribe. | Riched20.dll | ||||||||
| 4.1 | | Hyphenation, page rotation, and Text Services Framework (TSF) support. | Msftedit.dll |
Links:
http://msdn.microsoft.com/library/psdk/winui/richedit_5a7n.htm
http://msdn.microsoft.com/library/en-us/shellcc/platform/commctls/richedit/richeditcontrols.asp
About Rich Edit Controls
Tutorial 33: RichEdit Control: Basics
Rich Edit Controls
| Control | Price | Unicode Aware NT/2000/XP | Link | Notes |
| vbAccelerator RichEdit control (164kb) | Free | http://www.vbaccelerator.com/codelib/richedit/richedit.htm | Source code included | |
| ctl_riched.msi (330Kb) | Free | Source code included |
Screenshot: RichEdit20W control with Uniscribe support (RichEdit and Uniscribe fallback).
|
14 |
GDI+ ships with Windows XP and 2000 (SP3). You can also install it on other Windows platforms (98 or later). It is a free download from Microsoft at gdiplus_dnld.exe. It is Unicode aware on all platforms and lets you draw high-quality antialiased Unicode text. It also supports rotation, skewing, and text outlining (logo style).

To use Unicode in Win98 you must install a Unicode font such as Arial Unicode MS. Complex glyphs such as Chinese will render in the English version of Win98SE even though the operating system does not support this via DrawTextW or TextOutW. How does it do this? GDI+ uses Uniscribe under the hood to render Unicode.
You can find more information on using GDI+ in Visual Basic here
| Sample Vb source code, declares, and class wrapper. | PlanetSourceCode | Enter GDI+ in the search window at top of page. |
| Sample Vb source code and class wrapper. | www.vbaccelerator.com | |
| GDI+ Type Library (GDIPlus.TLB) version 1.30. Updated Thursday, 02 Aug 2003 06:51:14 UTC | GDI+ Type Library | This Type Library provides everything you need to start working with GDI+ in Visual Basic, giving you direct access to the native GDI+ functions. The TLB only exposes the native (flat) exports from the GDI+ DLL and not the C++ class methods. |
Note: Most examples you will find on the Web use class wrappers which are geared to C++/.Net users. The Vb equivalent wrappers on PSC and VbAccelerator are not complete, so you may want to stick with local GDIplus Declares or use a Type Library.
See advantage of using the TLB here.
|
15 |
From antiquity, the routines DrawText(), TextOut() and (more recently) ExtTextOut() have been the work-horse APIs used to draw text. Except on OS editions localized for regions of the world where complex scripts were needed, these routines did not do any complex script handling. And in fact they still do not, except on Windows 2000 and XP. On those operating systems, which come with Uniscribe as a standard system component, these standard APIs now do complex script shaping even for programs that have not been designed for it or are not expecting it.
If you are coding a User Control or subclassing a control to take advantage of Owner Draw or Custom Draw features then use API DrawTextW or TextOutW as follows:
Note: lFlags values can be found here: DrawText Align Flags.
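A minimal sketch of such a call, assuming an NT-based system (NT/2000/XP) where the W entry points are implemented. The helper name DrawUnicodeText and its parameters are illustrative only and are not taken from the tutorial's own listing; the Declare follows the documented DrawTextW prototype, with the string passed as a pointer so that Vb does not convert it to ANSI.

```vb
Option Explicit

Private Type RECT
    Left As Long
    Top As Long
    Right As Long
    Bottom As Long
End Type

' lpStr is declared As Long so StrPtr(s) can be passed. Declaring it
' "ByVal lpStr As String" would make Vb convert the text to ANSI first.
Private Declare Function DrawTextW Lib "user32" ( _
    ByVal hdc As Long, ByVal lpStr As Long, ByVal nCount As Long, _
    lpRect As RECT, ByVal wFormat As Long) As Long

Private Const DT_LEFT As Long = &H0&
Private Const DT_WORDBREAK As Long = &H10&
Private Const DT_NOPREFIX As Long = &H800&

' Hypothetical helper: draws sText inside the rectangle (x1,y1)-(x2,y2),
' given in device units (pixels) of the target device context.
Public Sub DrawUnicodeText(ByVal hdc As Long, ByRef sText As String, _
        ByVal x1 As Long, ByVal y1 As Long, ByVal x2 As Long, ByVal y2 As Long)
    Dim tR As RECT
    If LenB(sText) = 0 Then Exit Sub    ' StrPtr(vbNullString) is 0
    tR.Left = x1: tR.Top = y1: tR.Right = x2: tR.Bottom = y2
    ' The length is passed explicitly, so no terminating null is required.
    DrawTextW hdc, StrPtr(sText), Len(sText), tR, _
        DT_LEFT Or DT_WORDBREAK Or DT_NOPREFIX
End Sub
```

From a Paint event you might call it as `DrawUnicodeText UserControl.hdc, m_Caption, 0, 0, 200, 40`. Whether the glyphs actually appear still depends on the font selected into the device context.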
|
16 |
|
Updated 31-August-2010 14:43 -0300
|
|||
|
Trademarks
Credits
|
|
17 |
The intrinsic controls that ship with Vb6 are not Unicode aware so you have several options.
Note: Most of the above techniques will not work under Themed WinXP under the following conditions:
|
|
|
Forms 2.0 Object Library (Unicode) |
|
| Advantages | Disadvantages |
| Although Microsoft does not recommend or provide support for use of these controls under Visual Basic they do appear to work well despite the disclaimer. | No Microsoft support. |
| Easy way to get limited Unicode support in your projects. | No Uniscribe support (No Font fallback or Complex Scripts). Because of this the control Font must contain ALL the characters you are trying to render. |
| Not redistributable however anyone can get it for free by downloading MS ActiveXControlPad. | No XP theme support. |
| Requires FM20ENU.Dll (installed only with US versions of Office). | |
| Windowless controls thus no hWnd. | |
| Requires at least VB6-SP4 to use IME(Input Method Editor). | |
As can be seen above, it is an easy way to get limited Unicode support in your projects, but it falls
well short of something you would want to use in a commercially distributed
application.
|
18 |
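UTF-8 is a lossless method of encoding Unicode. Each UTF-16 code unit in the Basic Multilingual Plane takes 1 to 3 bytes; a character above U+FFFF (a surrogate pair in UTF-16) takes 4 bytes.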
| Unicode Hex From-To | Unicode Dec From-To | Output Bytes |
| 0..7F | 0..127 | 1 |
| 80..7FF | 128..2047 | 2 |
| 800..FFFF | 2048..65535 | 3 |
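The ranges in this table can be turned into a small byte-counting routine. This is a sketch only: the function names Utf8BytesForUnit and Utf8Length are made up for illustration, and the code assumes surrogates are correctly paired (each half is counted as 2 bytes, so that a pair totals the 4 bytes UTF-8 needs for a character above U+FFFF).

```vb
Option Explicit

' Number of UTF-8 bytes needed for one UTF-16 code unit.
Public Function Utf8BytesForUnit(ByVal iChar As Integer) As Long
    Dim lCode As Long
    lCode = iChar And &HFFFF&               ' AscW returns a signed Integer
    Select Case lCode
        Case Is < &H80&
            Utf8BytesForUnit = 1            ' 0..7F
        Case Is < &H800&
            Utf8BytesForUnit = 2            ' 80..7FF
        Case &HD800& To &HDFFF&
            Utf8BytesForUnit = 2            ' half of a surrogate pair (4 in total)
        Case Else
            Utf8BytesForUnit = 3            ' 800..FFFF
    End Select
End Function

' Number of bytes the whole string will occupy once encoded as UTF-8.
Public Function Utf8Length(ByRef sText As String) As Long
    Dim i As Long
    For i = 1 To Len(sText)
        Utf8Length = Utf8Length + Utf8BytesForUnit(AscW(Mid$(sText, i, 1)))
    Next
End Function
```

This is handy for sizing a buffer before a brute-force conversion; an unpaired surrogate would be miscounted, so validate the input first if that can occur.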
According to the HTML standard, HTML files should be transmitted in Big-Endian format, and using a Byte-Order Mark (BOM) is recommended. Since the Windows native format is Little-Endian, it is recommended to use UTF-8 encoding in HTML files to avoid the Endian conversion problem. In addition to converting your strings to UTF-8, add the following MetaTag to indicate that your content is UTF-8:
"<HEAD><meta http-equiv='Content-Type' content='text/html; charset=utf-8'></HEAD>"
| Platforms | Function | Comments |
| All | WideCharToMultiByte Lib "kernel32", MultiByteToWideChar Lib "kernel32" | Codepage CP_UTF8 = 65001. This is a pseudo codepage; there is no corresponding NLS file. This code page ID can only be used with the WideCharToMultiByte() and MultiByteToWideChar() API calls. |
| All | ConvertINetMultiByteToUnicode Lib "mlang", ConvertINetUnicodeToMultiByte Lib "mlang" | Minimum availability: Internet Explorer 5.5 |
Brute force conversions and API equivalents
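As an illustration of the API route, here is a sketch built on the two kernel32 calls from the table above. The wrapper names Utf16ToUtf8 and Utf8ToUtf16 are assumptions made for this example, not the tutorial's own routines; the Declares pass every buffer as a pointer (As Long) so that Vb performs no hidden ANSI conversion.

```vb
Option Explicit

Private Const CP_UTF8 As Long = 65001

Private Declare Function WideCharToMultiByte Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, _
    ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, _
    ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long

Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long

' UTF-16 String -> UTF-8 byte array (a 0 To -1 array for an empty string).
Public Function Utf16ToUtf8(ByRef sText As String) As Byte()
    Dim b() As Byte, lBytes As Long
    b = ""                                   ' zero-length array
    If Len(sText) > 0 Then
        ' First call with a null buffer returns the size needed in bytes.
        lBytes = WideCharToMultiByte(CP_UTF8, 0, StrPtr(sText), Len(sText), 0, 0, 0, 0)
        If lBytes > 0 Then
            ReDim b(0 To lBytes - 1)
            ' The last two arguments must be null (0) for CP_UTF8.
            WideCharToMultiByte CP_UTF8, 0, StrPtr(sText), Len(sText), _
                VarPtr(b(0)), lBytes, 0, 0
        End If
    End If
    Utf16ToUtf8 = b
End Function

' UTF-8 byte array -> UTF-16 String. Expects a dimensioned array.
Public Function Utf8ToUtf16(ByRef b() As Byte) As String
    Dim lBytes As Long, lChars As Long, sOut As String
    lBytes = UBound(b) - LBound(b) + 1
    If lBytes <= 0 Then Exit Function
    lChars = MultiByteToWideChar(CP_UTF8, 0, VarPtr(b(LBound(b))), lBytes, 0, 0)
    If lChars = 0 Then Exit Function
    sOut = String$(lChars, vbNullChar)
    MultiByteToWideChar CP_UTF8, 0, VarPtr(b(LBound(b))), lBytes, StrPtr(sOut), lChars
    Utf8ToUtf16 = sOut
End Function
```

The two-call pattern (measure, then convert) avoids guessing the buffer size. Error handling is deliberately thin here; production code would check Err.LastDllError when a call returns 0.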
|
19 |
Sample Grid saved as HTML(UTF-8 encoded). Supports Header, Filter Bar, Icons, Pictures, Backgrounds, Alignment, Unicode text, Cell ForeColor/BackColor, and Hyperlinks
DemoHeader - ExportHTML
| Name | Multiline Headers CHS: 欢迎 JPN: よろてそ | Type | Modified | GEO: სასურველი | Col 6 | Col 7 | Col 8 | Col 9 | Col 10 |
| 0,707107 | 2007-190 | | Arabic | United Arab Emirates | Abu Dhabi | Row1,Col10 |
| 316.615 | 2008-359 | | Chinese Simplified | China | Beijing | Row2,Col10 |
| 316.615 | 2009-013 | Chinese Traditional | Taiwan | Taipei | Row3,Col10 |
| www.unisuite.com | 2009-120 | | English | United States | Washington, DC | Row4,Col10 |
| HTML Table | | Georgian | Georgia | Tbilisi | Row5,Col10 |
| HTML Markers | | Greek | Greece | Athens | Row6,Col10 |
| 491.467 | 2005-201 | | Hebrew | Israel | Jerusalem | Row7,Col10 |
| 276.606 | 2005-299 | | Hindi | India | New Delhi | Row8,Col10 |
| 617.782 | 2011-311 | | Japanese | Japan | Tokyo | Row9,Col10 |
| 728.881 | 2011-076 | | Korean | Korea | Seoul | Row10,Col10 |
| 111.536 | 2011-212 | | Punjabi | Pakistan | Islamabad | Row11,Col10 |
| 603.123 | 2004-025 | | Portuguese(BR) | Brazil | Brasilia | Row12,Col10 |
| 298.299 | 2003-102 | | Russian | Russian Federation | Moscow | Row13,Col10 |
| 315.590 | 2011-346 | | Tamil | India | New Delhi | Row14,Col10 |
| 291.797 | 2004-051 | | Thai | Thailand | Bangkok | Row15,Col10 |
| 429.992 | 2006-253 | | Urdu | Pakistan | Islamabad | Row16,Col10 |
| 663.936 | 2005-049 | | Vietnamese | Vietnam | Hanoi | Row17,Col10 |
| 84.635 | 2007-312 | | Arabic | United Arab Emirates | Abu Dhabi | Row18,Col10 |
| 823.354 | 2006-088 | | Chinese Simplified | China | Beijing | Row19,Col10 |
| 662.429 | 2008-124 | | Chinese Traditional | Taiwan | Taipei | Row20,Col10 |
| 588.293 | 2009-330 | | English | United States | Washington, DC | Row21,Col10 |
| 23.728 | 2007-317 | | Georgian | Georgia | Tbilisi | Row22,Col10 |
| 710.879 | 2007-165 | | Greek | Greece | Athens | Row23,Col10 |
| 370.642 | 2006-061 | | Hebrew | Israel | Jerusalem | Row24,Col10 |
| 255.690 | 2011-012 | | Hindi | India | New Delhi | Row25,Col10 |
DemoFooter - ExportHTML
|
20 |
Exporting Unicode to a file usually involves prefixing the text with a BOM (Byte Order Mark). See the table above for the prefix required by each Unicode format. UTF-8 is recommended, even though it can take more space than UTF-16 for some scripts (CJK text, for example), since it is easier to transport to other platforms and easier to use in HTML.
See FileIO for methods on how to Read/Write Unicode files.
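For orientation, a minimal sketch of writing and reading a UTF-16LE file with its FF FE byte order mark, using only intrinsic Vb file statements. The routine names WriteUtf16File and ReadUtf16File are hypothetical and are not the FileIO methods referred to above. It relies on the fact that assigning a String to a Byte array copies the raw UTF-16 bytes without any ANSI conversion. Note that the file name itself is still limited to ANSI characters, because Vb's Open statement is not Unicode aware.

```vb
Option Explicit

' Writes sText as UTF-16LE, prefixed with the FF FE byte order mark.
Public Sub WriteUtf16File(ByVal sPath As String, ByRef sText As String)
    Dim f As Integer, bBom(0 To 1) As Byte, b() As Byte
    bBom(0) = &HFF: bBom(1) = &HFE
    b = sText                            ' raw UTF-16LE bytes
    ' Binary mode does not truncate, so remove any existing file first.
    If Len(Dir$(sPath)) > 0 Then Kill sPath
    f = FreeFile
    Open sPath For Binary Access Write As #f
    Put #f, , bBom
    If LenB(sText) > 0 Then Put #f, , b
    Close #f
End Sub

' Reads a UTF-16LE file and strips the byte order mark.
Public Function ReadUtf16File(ByVal sPath As String) As String
    Dim f As Integer, b() As Byte, lLen As Long, s As String
    f = FreeFile
    Open sPath For Binary Access Read As #f
    lLen = LOF(f)
    If lLen >= 2 Then
        ReDim b(0 To lLen - 1)
        Get #f, , b
    End If
    Close #f
    If lLen < 2 Then Exit Function
    If b(0) = &HFF And b(1) = &HFE Then
        s = b                            ' bytes back to a String, no conversion
        ReadUtf16File = MidB$(s, 3)      ' skip the 2-byte BOM
    Else
        Err.Raise 5, , "File does not start with a UTF-16LE byte order mark"
    End If
End Function
```

For UTF-8 output the same pattern applies with the EF BB BF prefix and a UTF-8 byte array instead of the raw string bytes.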
|
21 |
The VB Clipboard functions are ANSI only. If you need to access the Clipboard 'Unicode string' you must use format 'CF_UNICODETEXT = 13'. Code to do this can be found at http://www.vbaccelerator.com/home/VB/Code/Libraries/Clipboard/Customising_Clipboard_Use/article.asp
Standard Clipboard Formats
| CF_TEXT = 1 | CF_PENDATA = 10 | CF_OWNERDISPLAY = &H80 |
| CF_BITMAP = 2 | CF_RIFF = 11 | CF_DSPTEXT = &H81 |
| CF_METAFILEPICT = 3 | CF_WAVE = 12 | CF_DSPBITMAP = &H82 |
| CF_SYLK = 4 | CF_UNICODETEXT = 13 | CF_DSPMETAFILEPICT = &H83 |
| CF_DIF = 5 | CF_ENHMETAFILE = 14 | CF_DSPENHMETAFILE = &H8E |
| CF_TIFF = 6 | CF_HDROP = 15 | CF_PRIVATEFIRST = &H200 |
| CF_OEMTEXT = 7 | CF_LOCALE = 16 | CF_PRIVATELAST = &H2FF |
| CF_DIB = 8 | CF_DIBV5 = 17 | CF_GDIOBJFIRST = &H300 |
| CF_PALETTE = 9 | CF_MAX = 18 | CF_GDIOBJLAST = &H3FF |
Windows Message Clipboard Constants
| WM_PASTE = &H302 |
| WM_CLEAR = &H303 |
| WM_COPY = &H301 |
| WM_CUT = &H300 |
| WM_UNDO = &H304 |
Windows Clipboard API's
| Declare Function CloseClipboard Lib "user32" () As Long |
| Declare Function EmptyClipboard Lib "user32" () As Long |
| Declare Function GetClipboardData Lib "user32" (ByVal wFormat As Long) As Long |
| Declare Function IsClipboardFormatAvailable Lib "user32" (ByVal wFormat As Long) As Long |
| Declare Function OpenClipboard Lib "user32" (ByVal hwnd As Long) As Long |
| Declare Function SetClipboardData Lib "user32" (ByVal wFormat As Long, ByVal hMem As Long) As Long |
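To show how these calls fit together, here is a sketch that reads CF_UNICODETEXT from the clipboard. The function name GetClipboardUnicode is an assumption for this example and is not the vbAccelerator code linked above. Besides the user32 Declares listed in the table, it needs GlobalLock, GlobalUnlock, lstrlenW and RtlMoveMemory from kernel32.

```vb
Option Explicit

Private Const CF_UNICODETEXT As Long = 13

Private Declare Function OpenClipboard Lib "user32" (ByVal hwnd As Long) As Long
Private Declare Function CloseClipboard Lib "user32" () As Long
Private Declare Function IsClipboardFormatAvailable Lib "user32" (ByVal wFormat As Long) As Long
Private Declare Function GetClipboardData Lib "user32" (ByVal wFormat As Long) As Long
Private Declare Function GlobalLock Lib "kernel32" (ByVal hMem As Long) As Long
Private Declare Function GlobalUnlock Lib "kernel32" (ByVal hMem As Long) As Long
Private Declare Function lstrlenW Lib "kernel32" (ByVal lpString As Long) As Long
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" ( _
    ByVal pDest As Long, ByVal pSrc As Long, ByVal cbBytes As Long)

' Returns the clipboard text as a Unicode string, or "" if none is available.
Public Function GetClipboardUnicode(Optional ByVal hWndOwner As Long = 0) As String
    Dim hMem As Long, pText As Long, lChars As Long, sOut As String
    If IsClipboardFormatAvailable(CF_UNICODETEXT) = 0 Then Exit Function
    If OpenClipboard(hWndOwner) = 0 Then Exit Function
    hMem = GetClipboardData(CF_UNICODETEXT)   ' handle is owned by the clipboard
    If hMem <> 0 Then
        pText = GlobalLock(hMem)
        If pText <> 0 Then
            lChars = lstrlenW(pText)          ' length in characters, not bytes
            If lChars > 0 Then
                sOut = String$(lChars, vbNullChar)
                CopyMemory StrPtr(sOut), pText, lChars * 2
            End If
            GlobalUnlock hMem
        End If
    End If
    CloseClipboard                            ' do not free hMem ourselves
    GetClipboardUnicode = sOut
End Function
```

Writing works the other way round: allocate movable global memory, copy the string plus a terminating null into it, then hand the handle to SetClipboardData between OpenClipboard/EmptyClipboard and CloseClipboard.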
|
22 |

| Block Name | From | To | # Codepoints | Font |
|---|---|---|---|---|
| Basic Latin | U+0000 | U+007F | (128) | Tahoma |
| Latin-1 Supplement | U+0080 | U+00FF | (128) | Tahoma |
| Latin Extended-A | U+0100 | U+017F | (128) | Tahoma |
| Latin Extended-B | U+0180 | U+024F | (208) | Tahoma |
| IPA Extensions | U+0250 | U+02AF | (96) | Arial Unicode MS |
| Spacing Modifier Letters | U+02B0 | U+02FF | (80) | Tahoma |
| Combining Diacritical Marks | U+0300 | U+036F | (112) | Tahoma |
| Greek and Coptic | U+0370 | U+03FF | (127) | Tahoma |
| Cyrillic | U+0400 | U+04FF | (255) | Tahoma |
| Cyrillic Supplement | U+0500 | U+052F | (20) | Tahoma |
| Armenian | U+0530 | U+058F | (86) | Sylfaen |
| Hebrew | U+0590 | U+05FF | (87) | David |
| Arabic | U+0600 | U+06FF | (235) | Arial Unicode MS |
| Syriac | U+0700 | U+074F | (77) | Estrangelo Edessa |
| Arabic Supplement | U+0750 | U+077F | (30) | Arial Unicode MS |
| Thaana | U+0780 | U+07BF | (50) | MV Boli |
| NKo | U+07C0 | U+07FF | (59) | Arial Unicode MS |
| Devanagari | U+0900 | U+097F | (110) | Mangal |
| Bengali | U+0980 | U+09FF | (91) | Vrinda |
| Gurmukhi | U+0A00 | U+0A7F | (77) | Raavi |
| Gujarati | U+0A80 | U+0AFF | (83) | Shruti |
| Oriya | U+0B00 | U+0B7F | (81) | Arial Unicode MS |
| Tamil | U+0B80 | U+0BFF | (71) | Latha |
| Telugu | U+0C00 | U+0C7F | (80) | Gautami |
| Kannada | U+0C80 | U+0CFF | (86) | Tunga |
| Malayalam | U+0D00 | U+0D7F | (78) | Kartika |
| Sinhala | U+0D80 | U+0DFF | (80) | Arial Unicode MS |
| Thai | U+0E00 | U+0E7F | (87) | Angsana New |
| Lao | U+0E80 | U+0EFF | (65) | Arial Unicode MS |
| Tibetan | U+0F00 | U+0FFF | (195) | Arial Unicode MS |
| Myanmar | U+1000 | U+109F | (78) | Arial Unicode MS |
| Georgian | U+10A0 | U+10FF | (83) | Sylfaen |
| Hangul Jamo | U+1100 | U+11FF | (240) | Batang |
| Ethiopic | U+1200 | U+137F | (356) | Arial Unicode MS |
| Ethiopic Supplement | U+1380 | U+139F | (26) | Arial Unicode MS |
| Cherokee | U+13A0 | U+13FF | (85) | Arial Unicode MS |
| Unified Canadian Aboriginal Syllabics | U+1400 | U+167F | (630) | Arial Unicode MS |
| Ogham | U+1680 | U+169F | (29) | Arial Unicode MS |
| Runic | U+16A0 | U+16FF | (81) | Arial Unicode MS |
| Tagalog | U+1700 | U+171F | (20) | Arial Unicode MS |
| Hanunoo | U+1720 | U+173F | (23) | Arial Unicode MS |
| Buhid | U+1740 | U+175F | (20) | Arial Unicode MS |
| Tagbanwa | U+1760 | U+177F | (18) | Arial Unicode MS |
| Khmer | U+1780 | U+17FF | (114) | Arial Unicode MS |
| Mongolian | U+1800 | U+18AF | (155) | Arial Unicode MS |
| Limbu | U+1900 | U+194F | (66) | Arial Unicode MS |
| Tai Le | U+1950 | U+197F | (35) | Arial Unicode MS |
| New Tai Lue | U+1980 | U+19DF | (80) | Arial Unicode MS |
| Khmer Symbols | U+19E0 | U+19FF | (32) | Arial Unicode MS |
| Buginese | U+1A00 | U+1A1F | (30) | Arial Unicode MS |
| Balinese | U+1B00 | U+1B7F | (121) | Arial Unicode MS |
| Phonetic Extensions | U+1D00 | U+1D7F | (128) | Arial Unicode MS |
| Phonetic Extensions Supplement | U+1D80 | U+1DBF | (64) | Arial Unicode MS |
| Combining Diacritical Marks Supplement | U+1DC0 | U+1DFF | (13) | Arial Unicode MS |
| Latin Extended Additional | U+1E00 | U+1EFF | (246) | Tahoma |
| Greek Extended | U+1F00 | U+1FFF | (233) | Tahoma |
| General Punctuation | U+2000 | U+206F | (106) | Tahoma |
| Superscripts and Subscripts | U+2070 | U+209F | (34) | Tahoma |
| Currency Symbols | U+20A0 | U+20CF | (22) | Tahoma |
| Combining Diacritical Marks for Symbols | U+20D0 | U+20FF | (32) | Tahoma |
| Letterlike Symbols | U+2100 | U+214F | (79) | Tahoma |
| Number Forms | U+2150 | U+218F | (50) | Tahoma |
| Arrows | U+2190 | U+21FF | (112) | Arial Unicode MS |
| Mathematical Operators | U+2200 | U+22FF | (256) | Arial Unicode MS |
| Miscellaneous Technical | U+2300 | U+23FF | (232) | Tahoma |
| Control Pictures | U+2400 | U+243F | (39) | Tahoma |
| Optical Character Recognition | U+2440 | U+245F | (11) | Tahoma |
| Enclosed Alphanumerics | U+2460 | U+24FF | (160) | Tahoma |
| Box Drawing | U+2500 | U+257F | (128) | Tahoma |
| Block Elements | U+2580 | U+259F | (32) | Tahoma |
| Geometric Shapes | U+25A0 | U+25FF | (96) | Tahoma |
| Miscellaneous Symbols | U+2600 | U+26FF | (176) | Tahoma |
| Dingbats | U+2700 | U+27BF | (174) | Tahoma |
| Miscellaneous Mathematical Symbols-A | U+27C0 | U+27EF | (39) | Arial Unicode MS |
| Supplemental Arrows-A | U+27F0 | U+27FF | (16) | Arial Unicode MS |
| Braille Patterns | U+2800 | U+28FF | (256) | Arial Unicode MS |
| Supplemental Arrows-B | U+2900 | U+297F | (128) | Arial Unicode MS |
| Miscellaneous Mathematical Symbols-B | U+2980 | U+29FF | (128) | Arial Unicode MS |
| Supplemental Mathematical Operators | U+2A00 | U+2AFF | (256) | Arial Unicode MS |
| Miscellaneous Symbols and Arrows | U+2B00 | U+2BFF | (31) | Arial Unicode MS |
| Glagolitic | U+2C00 | U+2C5F | (94) | Arial Unicode MS |
| Latin Extended-C | U+2C60 | U+2C7F | (17) | Arial Unicode MS |
| Coptic | U+2C80 | U+2CFF | (114) | Arial Unicode MS |
| Georgian Supplement | U+2D00 | U+2D2F | (38) | Sylfaen |
| Tifinagh | U+2D30 | U+2D7F | (55) | Arial Unicode MS |
| Ethiopic Extended | U+2D80 | U+2DDF | (79) | Arial Unicode MS |
| Supplemental Punctuation | U+2E00 | U+2E7F | (26) | Tahoma |
| CJK Radicals Supplement | U+2E80 | U+2EFF | (115) | MingLiU |
| Kangxi Radicals | U+2F00 | U+2FDF | (214) | MingLiU |
| Ideographic Description Characters | U+2FF0 | U+2FFF | (12) | MingLiU |
| CJK Symbols and Punctuation | U+3000 | U+303F | (64) | MingLiU |
| Hiragana | U+3040 | U+309F | (93) | MS Gothic |
| Katakana | U+30A0 | U+30FF | (96) | MS Gothic |
| Bopomofo | U+3100 | U+312F | (40) | MingLiU |
| Hangul Compatibility Jamo | U+3130 | U+318F | (94) | Batang |
| Kanbun | U+3190 | U+319F | (16) | MingLiU |
| Bopomofo Extended | U+31A0 | U+31BF | (24) | MingLiU |
| CJK Strokes | U+31C0 | U+31EF | (16) | MingLiU |
| Katakana Phonetic Extensions | U+31F0 | U+31FF | (16) | MS Gothic |
| Enclosed CJK Letters and Months | U+3200 | U+32FF | (242) | MingLiU |
| CJK Compatibility | U+3300 | U+33FF | (256) | MingLiU |
| CJK Unified Ideographs Extension A | U+3400 | U+4DBF | (6582) | MingLiU |
| Yijing Hexagram Symbols | U+4DC0 | U+4DFF | (64) | MingLiU |
| CJK Unified Ideographs | U+4E00 | U+9FFF | (20924) | MingLiU |
| Yi Syllables | U+A000 | U+A48F | (1165) | Arial Unicode MS |
| Yi Radicals | U+A490 | U+A4CF | (55) | Arial Unicode MS |
| Modifier Tone Letters | U+A700 | U+A71F | (27) | MingLiU |
| Latin Extended-D | U+A720 | U+A7FF | (2) | Arial Unicode MS |
| Syloti Nagri | U+A800 | U+A82F | (44) | Arial Unicode MS |
| Phags-pa | U+A840 | U+A87F | (56) | Tahoma |
| Hangul Syllables | U+AC00 | U+D7AF | (2) | Batang |
| High Surrogates | U+D800 | U+DB7F | (2) | Arial Unicode MS |
| High Private Use Surrogates | U+DB80 | U+DBFF | (2) | ?? |
| Low Surrogates | U+DC00 | U+DFFF | (2) | Arial Unicode MS |
| Private Use Area | U+E000 | U+F8FF | (2) | Arial Unicode MS |
| CJK Compatibility Ideographs | U+F900 | U+FAFF | (467) | MingLiU |
| Alphabetic Presentation Forms | U+FB00 | U+FB4F | (58) | Tahoma |
| Arabic Presentation Forms-A | U+FB50 | U+FDFF | (595) | Arial Unicode MS |
| Variation Selectors | U+FE00 | U+FE0F | (16) | Tahoma |
| Vertical Forms | U+FE10 | U+FE1F | (10) | MingLiU |
| Combining Half Marks | U+FE20 | U+FE2F | (4) | Tahoma |
| CJK Compatibility Forms | U+FE30 | U+FE4F | (32) | MingLiU |
| Small Form Variants | U+FE50 | U+FE6F | (26) | MingLiU |
| Arabic Presentation Forms-B | U+FE70 | U+FEFF | (141) | Arial Unicode MS |
| Halfwidth and Fullwidth Forms | U+FF00 | U+FFEF | (225) | MingLiU |
| Specials | U+FFF0 | U+FFFF | (5) | Tahoma |
| Linear B Syllabary | U+10000 | U+1007F | (88) | Tahoma |
| Linear B Ideograms | U+10080 | U+100FF | (123) | Tahoma |
| Aegean Numbers | U+10100 | U+1013F | (57) | Tahoma |
| Ancient Greek Numbers | U+10140 | U+1018F | (75) | Tahoma |
| Old Italic | U+10300 | U+1032F | (35) | Tahoma |
| Gothic | U+10330 | U+1034F | (27) | Tahoma |
| Ugaritic | U+10380 | U+1039F | (31) | Tahoma |
| Old Persian | U+103A0 | U+103DF | (50) | Tahoma |
| Deseret | U+10400 | U+1044F | (80) | Tahoma |
| Shavian | U+10450 | U+1047F | (48) | Tahoma |
| Osmanya | U+10480 | U+104AF | (40) | Tahoma |
| Cypriot Syllabary | U+10800 | U+1083F | (55) | Tahoma |
| Phoenician | U+10900 | U+1091F | (27) | Tahoma |
| Kharoshthi | U+10A00 | U+10A5F | (65) | Tahoma |
| Cuneiform | U+12000 | U+123FF | (879) | Tahoma |
| Cuneiform Numbers and Punctuation | U+12400 | U+1247F | (103) | Tahoma |
| Byzantine Musical Symbols | U+1D000 | U+1D0FF | (246) | Tahoma |
| Musical Symbols | U+1D100 | U+1D1FF | (219) | Tahoma |
| Ancient Greek Musical Notation | U+1D200 | U+1D24F | (70) | Tahoma |
| Tai Xuan Jing Symbols | U+1D300 | U+1D35F | (87) | Tahoma |
| Counting Rod Numerals | U+1D360 | U+1D37F | (18) | Tahoma |
| Mathematical Alphanumeric Symbols | U+1D400 | U+1D7FF | (996) | Tahoma |
| CJK Unified Ideographs Extension B | U+20000 | U+2A6DF | (42711) | Tahoma |
| CJK Compatibility Ideographs Supplement | U+2F800 | U+2FA1F | (542) | Tahoma |
| Tags | U+E0000 | U+E007F | (97) | Tahoma |
| Variation Selectors Supplement | U+E0100 | U+E01EF | (240) | Tahoma |
| Supplementary Private Use Area-A | U+F0000 | U+FFFFF | (2) | Arial Unicode MS |
| Supplementary Private Use Area-B | U+100000 | U+10FFFF | (2) | Arial Unicode MS |
|
23 |
These are the flags you can set when using API DrawTextW. Combine using the OR operator. Example: DT_CENTER OR DT_WORDBREAK
| Name | Value | Description |
|---|---|---|
| DT_TOP | &H0& | Justifies the text to the top of the rectangle. |
| DT_LEFT | &H0& | Aligns text to the left. |
| DT_CENTER | &H1& | Centers text horizontally in the rectangle. |
| DT_RIGHT | &H2& | Aligns text to the right. |
| DT_VCENTER | &H4& | Centers text vertically. This value is used only with the DT_SINGLELINE value. |
| DT_BOTTOM | &H8& | Justifies the text to the bottom of the rectangle. This value is used only with the DT_SINGLELINE value. |
| DT_WORDBREAK | &H10& | Breaks words. Lines are automatically broken between words if a word would extend past the edge of the rectangle specified by the lpRect parameter. A carriage return-line feed sequence also breaks the line. If this is not specified, output is on one line. |
| DT_SINGLELINE | &H20& | Displays text on a single line only. Carriage returns and line feeds do not break the line. |
| DT_EXPANDTABS | &H40& | Expands tab characters. The default number of characters per tab is eight. The DT_WORD_ELLIPSIS, DT_PATH_ELLIPSIS, and DT_END_ELLIPSIS values cannot be used with the DT_EXPANDTABS value. |
| DT_TABSTOP | &H80& | Sets tab stops. Bits 15-8 (high-order byte of the low-order word) of the uFormat parameter specify the number of characters for each tab. The default number of characters per tab is eight. The DT_CALCRECT, DT_EXTERNALLEADING, DT_INTERNAL, DT_NOCLIP, and DT_NOPREFIX values cannot be used with the DT_TABSTOP value. |
| DT_NOCLIP | &H100& | Draws without clipping. DrawText is somewhat faster when DT_NOCLIP is used. |
| DT_EXTERNALLEADING | &H200& | Includes the font external leading in line height. Normally, external leading is not included in the height of a line of text. |
| DT_CALCRECT | &H400& | Determines the width and height of the rectangle. If there are multiple lines of text, DrawText uses the width of the rectangle pointed to by the lpRect parameter and extends the base of the rectangle to bound the last line of text. If the largest word is wider than the rectangle, the width is expanded. If the text is less than the width of the rectangle, the width is reduced. If there is only one line of text, DrawText modifies the right side of the rectangle so that it bounds the last character in the line. In either case, DrawText returns the height of the formatted text but does not draw the text. Thus it should never be used as a CellTextFormat parameter. |
| DT_NOPREFIX | &H800& | Turns off processing of prefix characters. Normally, DrawText interprets the mnemonic-prefix character & as a directive to underscore the character that follows, and the mnemonic-prefix characters && as a directive to print a single &. By specifying DT_NOPREFIX, this processing is turned off. For example, input string: "A&bc&&d" normal: "Abc&d" DT_NOPREFIX: "A&bc&&d" Compare with DT_HIDEPREFIX and DT_PREFIXONLY. |
| DT_INTERNAL | &H1000& | Uses the system font to calculate text metrics. |
| DT_EDITCONTROL | &H2000& | Duplicates the text-displaying characteristics of a multiline edit control. Specifically, the average character width is calculated in the same manner as for an edit control, and the function does not display a partially visible last line. |
| DT_PATH_ELLIPSIS | &H4000& | For displayed text, replaces characters in the middle of the string with ellipses so that the result fits in the specified rectangle. If the string contains backslash (\) characters, DT_PATH_ELLIPSIS preserves as much as possible of the text after the last backslash. The string is not modified unless the DT_MODIFYSTRING flag is specified. Compare with DT_END_ELLIPSIS and DT_WORD_ELLIPSIS. |
| DT_END_ELLIPSIS | &H8000& | For displayed text, if the end of a string does not fit in the rectangle, it is truncated and ellipses are added. If a word that is not at the end of the string goes beyond the limits of the rectangle, it is truncated without ellipses. The string is not modified unless the DT_MODIFYSTRING flag is specified. Compare with DT_PATH_ELLIPSIS and DT_WORD_ELLIPSIS. |
| DT_MODIFYSTRING | &H10000& | Modifies the specified string to match the displayed text. This value has no effect unless DT_END_ELLIPSIS or DT_PATH_ELLIPSIS is specified. |
| DT_RTLREADING | &H20000& | Layout in right-to-left reading order for bi-directional text when the font selected into the hdc is a Hebrew or Arabic font. The default reading order for all text is left-to-right. |
| DT_WORD_ELLIPSIS | &H40000& | Truncates any word that does not fit in the rectangle and adds ellipses. Compare with DT_END_ELLIPSIS and DT_PATH_ELLIPSIS. |
| DT_NOFULLWIDTHCHARBREAK | &H80000& | Windows 98/Me, Windows 2000/XP: Prevents a line break at a DBCS (double-wide character string), so that the line breaking rule is equivalent to SBCS strings. For example, this can be used in Korean windows, for more readability of icon labels. This value has no effect unless DT_WORDBREAK is specified. |
| DT_HIDEPREFIX | &H100000& | Windows 2000/XP: Ignores the ampersand (&) prefix character in the text. The letter that follows will not be underlined, but other mnemonic-prefix characters are still processed. For example: input string: "A&bc&&d" normal: "Abc&d" DT_HIDEPREFIX: "Abc&d" Compare with DT_NOPREFIX and DT_PREFIXONLY. |
| DT_PREFIXONLY | &H200000& | Windows 2000/XP: Draws only an underline at the position of the character following the ampersand (&) prefix character. Does not draw any other characters in the string. For example, input string: "A&bc&&d" normal: "Abc&d" DT_PREFIXONLY: " _ " Compare with DT_HIDEPREFIX and DT_NOPREFIX. |
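One entry in this table is easy to get wrong: DT_TABSTOP stores the tab width inside the format value itself. The sketch below shows one way to build such a value; the function name TabStopFormat is hypothetical. Because bits 15-8 carry the tab width, any flag living in those bits (DT_NOCLIP, DT_CALCRECT and so on) is cleared first, which is why the table says they cannot be combined with DT_TABSTOP. Place it in a standard module.

```vb
Option Explicit

Public Const DT_EXPANDTABS As Long = &H40&
Public Const DT_TABSTOP As Long = &H80&

' Returns a DrawText format value with the tab width (0-255 average
' characters) stored in bits 15-8 of the low-order word.
Public Function TabStopFormat(ByVal lBaseFlags As Long, ByVal lTabChars As Long) As Long
    If lTabChars < 0 Or lTabChars > 255 Then Err.Raise 5   ' invalid argument
    ' Strip bits 15-8 from the base flags, then add the tab flags and width.
    TabStopFormat = (lBaseFlags And Not &HFF00&) _
        Or DT_EXPANDTABS Or DT_TABSTOP _
        Or (lTabChars * &H100&)
End Function
```

For example, `TabStopFormat(DT_WORDBREAK, 4)` asks for word wrapping with tabs expanded to four characters.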
|
24 |
RTL (RightToLeft) provides support for languages such as Arabic, Hebrew, and Urdu.
| Activation Method | Activate | Deactivate |
| Process | SetProcessDefaultLayout (LAYOUT_RTL) | SetProcessDefaultLayout (0) |
| Window | lExStyles = lExStyles Or WS_EX_LAYOUTRTL | lExStyles = lExStyles And Not WS_EX_LAYOUTRTL |
| Device Context | SetLayout(hdc, LAYOUT_RTL) | SetLayout (hdc, 0) |
There are several Extended Style Constants used to enable Window Activation:
| LTR (Default), e.g. ENG: Welcome | RTL | Comments |
| WS_EX_LEFT = &H0 | WS_EX_RIGHT = &H1000 | Alignment |
| WS_EX_LTRREADING = &H0 | WS_EX_RTLREADING = &H2000 | Read Direction |
| WS_EX_RIGHTSCROLLBAR = &H0 | WS_EX_LEFTSCROLLBAR = &H4000 | Vertical Scrollbar Alignment |
| | WS_EX_LAYOUTRTL = &H400000 | Right to left mirroring |
| | WS_EX_NOINHERITLAYOUT = &H100000 | Disable inheritance of mirroring by children |
To test various Styles or Extended Styles put this code in a module:
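The sketch below is a stand-in for such a module, written for illustration; it is not the tutorial's original listing, and the names WithFlag and SetExStyleFlag are assumptions. It toggles a single extended style bit with GetWindowLong/SetWindowLong and then asks the window to recalculate its frame.

```vb
Option Explicit

Private Const GWL_EXSTYLE As Long = -20
Public Const WS_EX_LAYOUTRTL As Long = &H400000

Private Const SWP_NOSIZE As Long = &H1
Private Const SWP_NOMOVE As Long = &H2
Private Const SWP_NOZORDER As Long = &H4
Private Const SWP_FRAMECHANGED As Long = &H20

Private Declare Function GetWindowLong Lib "user32" Alias "GetWindowLongA" ( _
    ByVal hwnd As Long, ByVal nIndex As Long) As Long
Private Declare Function SetWindowLong Lib "user32" Alias "SetWindowLongA" ( _
    ByVal hwnd As Long, ByVal nIndex As Long, ByVal dwNewLong As Long) As Long
Private Declare Function SetWindowPos Lib "user32" ( _
    ByVal hwnd As Long, ByVal hWndInsertAfter As Long, _
    ByVal x As Long, ByVal y As Long, ByVal cx As Long, ByVal cy As Long, _
    ByVal wFlags As Long) As Long

' Pure helper: returns lStyle with lFlag switched on or off.
Public Function WithFlag(ByVal lStyle As Long, ByVal lFlag As Long, _
        ByVal bOn As Boolean) As Long
    If bOn Then
        WithFlag = lStyle Or lFlag
    Else
        WithFlag = lStyle And Not lFlag
    End If
End Function

' Adds or removes one extended style flag on an existing window.
Public Sub SetExStyleFlag(ByVal hwnd As Long, ByVal lFlag As Long, ByVal bOn As Boolean)
    Dim lOld As Long, lNew As Long
    lOld = GetWindowLong(hwnd, GWL_EXSTYLE)
    lNew = WithFlag(lOld, lFlag, bOn)
    If lNew <> lOld Then
        SetWindowLong hwnd, GWL_EXSTYLE, lNew
        ' Style changes only take effect after the frame is recalculated.
        SetWindowPos hwnd, 0, 0, 0, 0, 0, _
            SWP_NOSIZE Or SWP_NOMOVE Or SWP_NOZORDER Or SWP_FRAMECHANGED
    End If
End Sub
```

A call such as `SetExStyleFlag ListView1.hWnd, WS_EX_LAYOUTRTL, True` switches mirroring on; for regular (non-extended) styles the same pattern applies with GWL_STYLE (-16).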
Notes:
CreateWindowEx:
In most cases you should set the extended style when the control is created with CreateWindowEx.
You can also set the style later by adding or removing a style flag. Use the code above to do this.
In the case of ListView you must update both the main ListView
and the Header component styles.
There is no header component yet when the control is created with CreateWindowEx
so you must update this after you have created a header in code.
In the case of DTPicker you must also update the Calendar component.
MonthView is only one component so it can be set when the control is created with CreateWindowEx.
UserControl:
Setting RightToLeft on a UserControl does nothing on our two platforms: Win98SE (English) and WinXP-SP1 (Brazilian Portuguese). It may in fact work if your OS is Hebrew or Arabic.
You can use pSetStylePlus in the module above and UserControl.hWnd to enable various RTL styles.
In some cases the control is correctly mirrored, i.e. ScrollBar
is on the left, Icons are aligned right. However the text is mirrored
which is not correct. This was corrected with the following code.
SetLayout hdc, IIf(bRTL, LAYOUT_RTL, 0)
Suggested reading MSDN:
| Control | Should not allow layout inheritance |
| Listview | No |
| Panel | Yes |
| Statusbar | Yes |
| Tabcontrol | Yes |
| TabPage | Yes |
| Toolbar | No |
| TreeView | No |
| Form | Yes |
| Splitter | Yes |
Activating Mirroring per Device Context
GetLayout(hdc) will return the current layout state.
| SetLayout(hdc, 0) | If the device context is not mirrored |
| SetLayout(hdc, LAYOUT_RTL) | If the device context is mirrored |
| SetLayout(hdc, LAYOUT_RTL Or LAYOUT_BITMAPORIENTATIONPRESERVED) | If the device context is mirrored and the programmer does not want bitmaps to be mirrored (without this flag, BitBlt and StretchBlt output is mirrored too) |
| Constant | Value | Description |
| LAYOUT_RTL | 1 | Right to left |
| LAYOUT_BTT | 2 | Bottom to top |
| LAYOUT_VBH | 4 | Vertical before horizontal |
| LAYOUT_ORIENTATIONMASK | 7 | RTL + BTT + VBH |
| LAYOUT_BITMAPORIENTATIONPRESERVED | 8 | Disables mirroring of bitmaps during BitBlt and StretchBlt |
Microsoft Visual Basic includes standard features to create and run Windows applications with full bi-directional language functionality. However, these features are operational only when Microsoft Visual Basic is installed in a bi-directional 32-bit Microsoft Windows environment, such as Arabic Microsoft Windows 95. Other bi-directional 32-bit Microsoft Windows environments are available as well.
The Right-To-Left property has been added to forms, controls, and
other Visual Basic objects to provide an easy mechanism for creating objects
with bi-directional characteristics.
Although Right-To-Left is a part of every Microsoft Visual Basic
installation, it is operational only when Microsoft Visual Basic is installed in
a bi-directional 32-bit Microsoft Windows environment.
This is frustrating since we may be developing in a non-BIDI environment and yet want to support BIDI.
Using WS_EX_LAYOUTRTL would seem the logical answer here, but when you try to
manipulate a UserControl using this flag it does not work correctly on a
non-BIDI OS.
The trick is to not bind the RightToLeft property to the UserControl and instead
owner-draw the control to replicate both BIDI and normal modes.
Thus LTR would use something like "DT_LEFT" and RTL would use "DT_RIGHT Or
DT_RTLREADING".
DT_SINGLELINE or DT_WORDBREAK and other options would be added to both
alignments.
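The flag selection described here can be sketched as a small helper. The name CaptionFormat and the bRightToLeft parameter are assumptions for this example; the idea is that the control keeps its own RTL setting and passes the resulting value to DrawTextW when it paints. Place it in a standard module.

```vb
Option Explicit

Public Const DT_LEFT As Long = &H0&
Public Const DT_RIGHT As Long = &H2&
Public Const DT_RTLREADING As Long = &H20000

' Returns the DrawText format for a caption.
' lCommonFlags carries the options shared by both directions,
' for example DT_SINGLELINE or DT_WORDBREAK.
Public Function CaptionFormat(ByVal bRightToLeft As Boolean, _
        ByVal lCommonFlags As Long) As Long
    If bRightToLeft Then
        CaptionFormat = lCommonFlags Or DT_RIGHT Or DT_RTLREADING
    Else
        CaptionFormat = lCommonFlags Or DT_LEFT
    End If
End Function
```

In the paint routine this would be used along the lines of `DrawTextW hdc, StrPtr(m_Caption), -1, rct, CaptionFormat(m_RightToLeft, DT_WORDBREAK)`, where m_RightToLeft is a member variable you maintain yourself.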
Here is what Microsoft has to say about "MiddleEast - Frequently Asked Questions"
http://www.microsoft.com/middleeast/msdn/faq.aspx
The part about installing Arabic Support to Win2000/XP/Vista is True.
You can take with a grain of salt these statements:
| Microsoft | CyberActiveX |
| Although Right-To-Left is a part of every Microsoft Visual Basic installation, it is operational only when Microsoft Visual Basic is installed in a bi-directional 32-bit Microsoft Windows environment. | While you can't set this Property in a UserControl unless you are in a BIDI environment you can Owner-Draw to emulate RTL or set style to include WS_EX_LAYOUTRTL. |
| On Windows XP: "Regional and Language Options" go to the "Advanced" tab and change the "Language for non Unicode programs" to an Arabic language. | This is not always necessary and makes development a pain on U.S. systems. This might be true for SBCS but not for Unicode or when you are developing your own controls. |
|
25 |
Just so you do not have to reinvent the wheel check out
these hints:
|
26 |
Note:
|
27 |
GB18030–2000 is a new Chinese character encoding standard. The standard contains many characters and has some tough new conformance requirements. GB18030-2000 encodes characters in sequences of one, two, or four bytes. These sequences are defined as follows:
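As a rough guide, the three forms can be told apart from the first two bytes: a single byte is &H00-&H7F; a two-byte sequence has a lead byte &H81-&HFE followed by &H40-&H7E or &H80-&HFE; a four-byte sequence has a lead byte &H81-&HFE followed by &H30-&H39 (its third and fourth bytes repeat those two ranges). The sketch below applies that rule; the function name Gb18030SeqLen is made up for illustration and it does not validate the third and fourth bytes.

```vb
Option Explicit

' Returns 1, 2 or 4 (the sequence length in bytes) based on the first
' two bytes of a GB18030 sequence, or 0 if they cannot start one.
Public Function Gb18030SeqLen(ByVal b1 As Byte, Optional ByVal b2 As Byte = 0) As Long
    If b1 <= &H7F Then
        Gb18030SeqLen = 1                       ' single byte (ASCII range)
    ElseIf b1 >= &H81 And b1 <= &HFE Then
        If b2 >= &H30 And b2 <= &H39 Then
            Gb18030SeqLen = 4                   ' four-byte form
        ElseIf b2 >= &H40 And b2 <= &HFE And b2 <> &H7F Then
            Gb18030SeqLen = 2                   ' two-byte form
        End If
    End If
End Function
```

A scanner would call this at each sequence start and advance by the returned length, treating 0 as an invalid byte.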
The government has said that it will enforce the standard first for platforms, and later for tools and applications.
| Product | GB-18030 Support | Approval Status |
| Windows XP | Yes (with add-on) | Approved by the Chinese testing agency. |
| Windows 2000 | Yes (with add-on) | Approved by the Chinese testing agency. |
| Windows NT4 | No | Exempt from the law because released before the standard was published. |
| Windows 98 | No | Exempt from the law because released before the standard was published. |
| Windows 95 | No | Exempt from the law because released before the standard was published. |
| Windows Millennium | No | Does not conform and may no longer be sold in China. |
| Internet Explorer 6.0 | Yes | ?? |
| Internet Explorer 5.5 and earlier | No | ?? |
GB18030 Support Package (English)
| File Name: | GBEXTSUP.msi |
| Download Size: | 7,015,936 bytes |
| Date Published: | 11-Oct-2003 19:00 |
| Version: | 1.0 |
| Download Link: | Download |
| Contents: |
*
According to Dr. International:
http://www.microsoft.com/globaldev/drintl/columns/015/default.mspx#Q13
|
Also see Surrogate pairs.
|
28 |
When you save strings to the PropertyBag in User Controls they are automatically converted to ANSI. You can, however, save Unicode as a string of hex values of the wide characters (AscW) and reconstruct the string upon retrieval.
A better (easier and faster) way to prevent the ANSI conversion is to convert the string to a byte array and then to a Variant before writing to the PropertyBag. The reverse process is used to read from the PropertyBag. Note that it is not necessary to dimension the byte array.
PropertyBag Wrapper Functions
An example of how to use these functions:
| Operation | Sample code |
|---|---|
| Default Property Value | Const m_def_Caption = "" |
| Property Variable | Dim m_Caption As String |
| Property Get | Public Property Get Caption() As String Caption = m_Caption End Property |
| Property Let | Public Property Let Caption(ByVal New_Caption As String) m_Caption = New_Caption DrawControl PropertyChanged s_Caption End Property |
| Read | m_Caption = VarToUni(PropBag.ReadProperty("UniCaption", m_def_Caption)) |
| Write | PropBag.WriteProperty "UniCaption", UniToVar(m_Caption), m_def_Caption |
This preserves the Unicode in string m_Caption. In an OwnerDraw label control you would then do something like this:
DrawTextW hdc, StrPtr(m_Caption), -1, rct, m_Alignment
Note: The problem occurs only with writing, so you can use the wrapper UniToVar to correctly save the Unicode and simply read it back using: m_Caption = PropBag.ReadProperty("Caption", m_def_Caption). Vb will automatically typecast the byte array to a string.
Note2: Entering the Unicode Caption/Text in the IDE requires you to create a Property Page that has a Unicode-aware TextBox or RichEdit.
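The bodies of the UniToVar and VarToUni wrappers are not reproduced above. A minimal sketch of what they might look like, assuming they do nothing more than the String to Byte array to Variant round trip just described (the implementations below are an assumption, not the original code):

```vb
'Hypothetical implementations of the wrappers named in the table above.

'Purpose: Wrap a Unicode string in a Variant holding a Byte array so that
'PropertyBag.WriteProperty stores the raw UTF-16 bytes without ANSI conversion.
Public Function UniToVar(ByVal sText As String) As Variant
    Dim b() As Byte
    b = sText           'copies the UTF-16LE bytes; no ReDim required
    UniToVar = b
End Function

'Purpose: Reverse of UniToVar. Also accepts a plain String, which is what
'ReadProperty returns when only the default value is available.
Public Function VarToUni(ByVal vData As Variant) As String
    Dim b() As Byte
    If VarType(vData) = (vbArray Or vbByte) Then
        b = vData
        VarToUni = b    'Byte array is typecast back to a String
    Else
        VarToUni = CStr(vData)
    End If
End Function
```

The VarType check is a design choice made for this sketch so that a string default such as m_def_Caption passes through unchanged.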
|
29 |
This is a method that allows you to input and store Unicode string captions/text without the need for a PropertyPage for Caption/Text or the need for persisting the caption as a ByteArray in the PropertyBag.
Using UniConverter.Zip(10Kb) convert your strings to UTF8 and simply paste the UTF8 string into the IDE Properties Window Caption/Text Property. You could also set them at runtime via code, resource file, database, or text file:
| Unicode String Desired | UTF-8 Equivalent |
| CHS: 欢迎 | CHS: æ¬¢è¿Ž |
In your UserControl convert the UTF8 to Unicode and render with DrawTextW or TextOutExW. The code to perform the conversion can be found here or at UniConverter.Zip(10Kb).
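A minimal sketch of the UTF-8 to Unicode step, assuming the caption was pasted as UTF-8 bytes displayed in the system ANSI code page. The function names Utf8BytesToString and Utf8CaptionToUnicode are hypothetical and this is not the UniConverter code; MultiByteToWideChar and CP_UTF8 are the standard kernel32 API. Whether every byte survives the ANSI round trip depends on the active code page, so treat that part as an assumption to verify on your target systems.

```vb
Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long

Private Const CP_UTF8 As Long = 65001

'Purpose: Decode an initialized array of UTF-8 bytes into a VB (UTF-16) string.
Public Function Utf8BytesToString(b() As Byte) As String
    Dim cb As Long
    Dim cch As Long
    Dim sOut As String
    cb = UBound(b) - LBound(b) + 1
    If cb <= 0 Then Exit Function
    'First call asks for the required length in characters.
    cch = MultiByteToWideChar(CP_UTF8, 0, VarPtr(b(LBound(b))), cb, 0, 0)
    If cch = 0 Then Exit Function
    sOut = String$(cch, vbNullChar)
    MultiByteToWideChar CP_UTF8, 0, VarPtr(b(LBound(b))), cb, StrPtr(sOut), cch
    Utf8BytesToString = sOut
End Function

'Purpose: Convert a caption pasted as UTF-8 text back to Unicode.
Public Function Utf8CaptionToUnicode(ByVal sCaption As String) As String
    Dim b() As Byte
    If LenB(sCaption) = 0 Then Exit Function
    b = StrConv(sCaption, vbFromUnicode)    'recover the original UTF-8 bytes
    Utf8CaptionToUnicode = Utf8BytesToString(b)
End Function
```

The result can then be passed to DrawTextW via StrPtr, as in the owner-draw examples elsewhere in this tutorial.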
|
30 |
This applies to User Control Unicode Strings:
|
31 |
Several methods are available to populate a Control or String variable with Unicode.
|
Method |
Details |
| Keyboard |
|
| Pickers | Use Unicode picker available in Office(WinWord). Select the Unicode text,
copy to clipboard(Ctrl-C) and paste(Ctrl-V) into your textbox at the desired
position.
|
| Paste |
|
| String |
|
Keyboard Input Methods for WinXP
| Application | Method | Result |
| Notepad | Type Alt + then a Unicode hex value such as C866. Do not release the Alt key until you type the final hex value. The character will appear when you release the Alt key. | 졦 |
| Microsoft Word WordPad RichEdit controls Office edit boxes |
Type a Unicode hex value such as C866 then Alt-x. If you made an error type Alt-x and the hex value will reappear. Correct the value and convert again. | 졦 |
For more information see http://www.microsoft.com/globaldev/handson/dev/Unicode-KbdsonWindows.pdf
|
32 |
Using a Resource file is yet another method of storing your Unicode strings. A sample can be found on MSDN: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vcsample/html/vcconUniresSampleDemonstratesUseOfUnicodeResourceFiles.asp.
Make sure you use a Unicode aware editor. In NT/2000/XP Notepad is OK. In Win98 you will need to use WordPad or UltraEdit32.
If you create a new resource file in Notepad the first time you save the
file use "Save As" and specify the coding as Unicode. If your file has Unicode
content and you attempt to save the file as ANSI you will receive a warning that
you have Unicode content and must save as Unicode.
Compile this new file using resource compiler Rc.Exe. You may want to make a
simple batch file to do this (make sure Rc.Exe is in an available path):
BuildRes.bat
-----------------
RC UnicodeStrings.rc
Pause
Vb6 has no problem reading a Resource file that was saved in ANSI or Unicode format and LoadResString will read Unicode from the resource correctly.
|
33 |
The recommended approach for dealing with multilingual resources is the use of Satellite Resource DLLs. Information can be found at Satellite resource DLLs.
|
34 |
The Good, the Bad, and the Ugly
![]() |
![]() |
![]() |
| ForeverBlue2 Theme | Luna Theme(Silver) | Vb Button |
As you can see from the above Screenshot XP Themes give a nice appearance to controls but they may have undesirable side effects when it comes to Unicode.
If you manage to coerce a Vb ANSI control to display Unicode, you will more than likely lose the Unicode(displays as ???) when a Theme is applied. One exception to this is using a hook and calling API "SetWindowTextW" when you receive the HCBT_ACTIVATE message. This appears to work both with and without a XP Theme.
You can circumvent this behaviour by creating from scratch an API Owner-Draw control where you apply the Theme manually and not via a Manifest file. API 'DrawThemeText' can be called just like DrawTextW where you supply a RECT, Pointer to a Unicode string, and an alignment Flag.
You may also want to fix some of the anomalies(aka bugs) with XP Themes or even add new features. For example you can see from the above screenshot that the button corners are transparent, something you don't get using a Manifest. Using a Manifest a background rectangle is painted(API DrawThemeParentBackground).
If no Theme is available or activated you can drop back to a standard or flat style control and use DrawTextW to output the Unicode text.
UxTheme.Dll Patch for WinXP
Ordinarily I would never recommend patching an Operating System DLL. However, Microsoft does not allow 3rd party themes like the one shown above (ForeverBlue2). There are lots of free themes available for download. You can use a commercial program to apply these themes or you can apply a patch to UxTheme.Dll. Patches are supplied for both XP and XP-SP1. I have been using both these patches for over a year with no problems. If you are tired of the Microsoft Luna Theme then try this. You are on your own, so make backups and follow the instructions carefully.
Now for the Link:
http://www.belchfire.net/article205.html
|
35 |
Vb menus are ANSI only.
The quick fix is to ignore Vb menus and create Unicode Owner-Draw menus at run-time. Almost all the menu source code you will find on the Web is ANSI only, but you can easily modify it to accommodate Unicode.
There are two basic types of Owner-Draw menus available:
Modifies standard Vb menus to add icons, styles, etc. These are more difficult to adapt to Unicode.
Create menus from scratch at run-time. Here you can easily assign Unicode captions and then handle the strings appropriately when you render them.
You are also responsible for responding to WM_MEASUREITEM messages to set the menu width.
Owner-Draw Menus with Source Code
| Link | Uses Vb Menus | Unicode | Notes |
| Menu System 4.1 | |||
| CoolMenu (real icon menus) | |||
| vbAccelerator - PopupMenu - Transparent Menu Demonstration | |||
| vbAccelerator - Using the cNewMenu DLL to Create Start Menu/ICQ ... | |||
| vbAccelerator - PopupMenu - Context Menu Demonstration | |||
| Using the cNewMenu DLL to Create Start Menu/ICQ Style Pop-up ... | |||
| CoolMenu 1.3.2 | |||
| MC API MENU CODE GENERATOR ver 2.0 | |||
| CoolMenu (real icon menus) | |||
| SuperCode XP | |||
| Office XP Menus (No OCX) | |||
| LaVolpe Submenus v2 | |||
| Update 11-Aug-2001 Enhanced VbAccelerator popup menus | |||
| First VB Menu Ownerdraw Engine | |||
| HookMenu 1.4 | Subclassing Thunk | ||
| HookMenu 1.5 | Subclassing Thunk |
|
36 |
Download a simple demo of an MS Access database with Unicode
here.
Also search MSDN or http://www.trigeminal.com for more info.
MS Access database supports Unicode and you can use:
Note that your control must be Unicode aware such as MSHFlexGrid or Forms 2.0 Object Library Controls.
Question Marks and Garbage characters instead of Arabic text:
http://www.microsoft.com/middleeast/msdn/Questionmark.aspx
|
37 |
Table of Grid Controls
As you can see from the table below there are very few Grid Controls that are Unicode aware. This table was prepared by searching for keyword 'unicode' on Features page for each control.
| Control | Unicode Aware NT/2000/XP | Unicode Aware | Link | Notes |
| sGrid2 | VbAccelerator sGrid2. |
Source code provided |
||
| VSFlexGrid Pro | VSFlexGrid Pro 8.0 |
A version that supports |
||
| InaGrid | InaGrid | Supports Unicode on Win9x Platforms |
||
| Exontrol ExGrid | Exontrol ExGrid | ANSI and UNICODE versions available |
||
| Objective Grid 9.02 | Rogue Wave Objective Grid for ActiveX | |||
| CyberActiveGrid | www.unisuite.com | |||
| Listview | Visual Studio | |||
| Listview | www.unisuite.com | |||
| MSFlexGrid | Visual Studio | |||
| MSHFlexGrid | Visual Studio | |||
| SCGrid | SCgrid | |||
| Janus GridEX Janus | Janus GridEx | |||
| True DBGrid Pro | True DBGrid Pro | |||
| UltraGrid | UltraGrid | |||
| Protoview Data Table(Infragistics) | DataTable | |||
| XpressQuantumGrid and Spread | XpressQuantumGrid | |||
| sGrid | www.VbAccelerator.com |
|
||
| iGrid | www.10Tec.com | |||
| EasyGrid | ||||
| DevGrid | DevGrid | |||
| FarPoint Spread 6 | FarPoint Spread 6 | |||
| Data Dynamics SharpGrid(#Grid) | SharpGrid(#Grid) | |||
| CTGrid | CTGrid | |||
| ActiveGrid | ActiveGrid | |||
| ContourCube | ContourCube | |||
| Data Widgets | Data Widgets | |||
| DataTable | ||||
| FlyTreeX | FlyTreeX | |||
| Gridwiz | Gridwiz | |||
| itGrid | www.it-partners.com | |||
| Mabry Grid/X Control | www.mabry.com/gridx/ | |||
| Rich Text Grid Control | RichText Grid Control | |||
| TList Pro | TList Pro | |||
| YDFGrid | YDFGrid |
|
38 |
In addition to the examples already provided here are some other useful source codes:
|
Convert:
|
|
![]() |
InternationalLocales.Zip(67Kb)
|
![]() |
|
![]() |
|
![]() |
|
| Function to determine if RTL languages are installed, courtesy of Keith LaVolpe. |
Private Declare Function GetProcAddress Lib "kernel32.dll" (ByVal hModule As Long, ByVal lpProcName As String) As Long
Private Declare Function GetModuleHandleA Lib "kernel32" (ByVal lpModuleName As String) As Long
Private Declare Function IsValidLanguageGroup Lib "kernel32.dll" (ByVal LanguageGroup As Long, ByVal dwFlags As Long) As Long

Public Function IsRTLCapable() As Boolean
    Const LGRPID_ARABIC As Long = &HD
    Const LGRPID_HEBREW As Long = &HC
    Const LGRPID_INSTALLED As Long = &H1
    'IsValidLanguageGroup is not exported on older platforms, so test for it first.
    If GetProcAddress(GetModuleHandleA("kernel32.dll"), "IsValidLanguageGroup") Then
        If IsValidLanguageGroup(LGRPID_ARABIC, LGRPID_INSTALLED) Then
            IsRTLCapable = True
        Else
            IsRTLCapable = IsValidLanguageGroup(LGRPID_HEBREW, LGRPID_INSTALLED)
        End If
    End If
End Function |
| Function to determine if Far East languages are installed (adapted from code by Keith LaVolpe). |
Private Declare Function GetProcAddress Lib "kernel32.dll" (ByVal hModule As Long, ByVal lpProcName As String) As Long
Private Declare Function GetModuleHandleA Lib "kernel32" (ByVal lpModuleName As String) As Long
Private Declare Function IsValidLanguageGroup Lib "kernel32.dll" (ByVal LanguageGroup As Long, ByVal dwFlags As Long) As Long

Public Function IsFarEastCapable() As Boolean |
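The body of IsFarEastCapable is not shown above. A sketch of how it could be written by analogy with IsRTLCapable, assuming that "Far East capable" means at least one of the Japanese, Korean, Traditional Chinese or Simplified Chinese language groups is installed (that interpretation is an assumption; the LGRPID values are the standard Windows language group identifiers):

```vb
Private Declare Function GetProcAddress Lib "kernel32.dll" (ByVal hModule As Long, ByVal lpProcName As String) As Long
Private Declare Function GetModuleHandleA Lib "kernel32" (ByVal lpModuleName As String) As Long
Private Declare Function IsValidLanguageGroup Lib "kernel32.dll" (ByVal LanguageGroup As Long, ByVal dwFlags As Long) As Long

'Sketch only: True if any East Asian language group is installed.
Public Function IsFarEastCapable() As Boolean
    Const LGRPID_JAPANESE As Long = &H7
    Const LGRPID_KOREAN As Long = &H8
    Const LGRPID_TRADITIONAL_CHINESE As Long = &H9
    Const LGRPID_SIMPLIFIED_CHINESE As Long = &HA
    Const LGRPID_INSTALLED As Long = &H1
    Dim vGroup As Variant

    'The API is missing on older platforms, so check for the export first.
    If GetProcAddress(GetModuleHandleA("kernel32.dll"), "IsValidLanguageGroup") = 0 Then
        Exit Function
    End If

    For Each vGroup In Array(LGRPID_JAPANESE, LGRPID_KOREAN, _
                             LGRPID_TRADITIONAL_CHINESE, LGRPID_SIMPLIFIED_CHINESE)
        If IsValidLanguageGroup(CLng(vGroup), LGRPID_INSTALLED) <> 0 Then
            IsFarEastCapable = True
            Exit Function
        End If
    Next
End Function
```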
|
39 |
You can find various functions on the Web that attempt to identify a string as Unicode:
| Function | Author | Comments |
| Function IsUnicode(ByVal s As
String) As Boolean IsUnicode = Len(s) = LenB(s) End Function |
MSDN ?? | |
|
Public Function IsUnicodeStr(s As
String) As Boolean |
MSDN ?? | This should work so there is probably something wrong with the implementation. |
| Public Function IsUtf16(ByVal s As String) As Boolean
    Dim i As Long
    Dim lLen As Long
    Dim lAscW As Long
    lLen = Len(s)
    For i = 1 To lLen
        'Take one character only; Mid$(s, i) would copy the rest of the string each time.
        lAscW = AscW(Mid$(s, i, 1))
        'AscW returns a signed Integer; map it to 0..65535.
        If lAscW < 0 Then
            lAscW = lAscW + 65536
        End If
        If (lAscW > 255) Then
            IsUtf16 = True
            Exit Function
        End If
    Next
End Function |
UniSuite |
|
| 'Purpose: Returns True if string has a Unicode char.
Public Function IsUnicode(s As String) As Boolean
    Dim i As Long
    Dim bLen As Long
    Dim Map() As Byte
    If LenB(s) Then
        Map = s
        bLen = UBound(Map)
        'Odd indexes hold the high byte of each 16-bit character.
        For i = 1 To bLen Step 2
            If (Map(i) > 0) Then
                IsUnicode = True
                Exit Function
            End If
        Next
    End If
End Function |
UniSuite |
No Mid$ or AscW. |
|
40 |
These are visibly displayed graphic characters, not invisible composition controls.
We are in the process of replacing the bitmaps with the actual characters if we can find them(Example of IDS, IDS Example Represents Char). We would appreciate any help from native Chinese speakers.
| Hex | Char | Font Code 2000 | IDEOGRAPHIC | Nº DC | Example of IDS | Example of IDS Bitmap | IDS Example Represents Char | IDS Example Represents Bitmap |
| 2FF0 | ⿰ | ⿰ | CHARACTER LEFT TO RIGHT | 2 | ⿰ | ![]() |
||
| 2FF1 | ⿱ | ⿱ | ABOVE TO BELOW | 2 | ⿱ 天 | ![]() |
||
| 2FF2 | ⿲ | ⿲ |
LEFT TO MIDDLE AND RIGHT |
3 | ⿲ | ![]() |
||
| 2FF3 | ⿳ | ⿳ |
ABOVE TO MIDDLE AND BELOW |
3 | ⿳从从日 | OK |
||
| 2FF4 | ⿴ | ⿴ | FULL SURROUND | 2 | ⿴口 | ![]() |
||
| 2FF5 | ⿵ | ⿵ |
SURROUND FROM ABOVE |
2 | ⿵門 | ![]() |
||
| 2FF6 | ⿶ | ⿶ |
SURROUND FROM BELOW |
2 | ⿶凵 土 | OK![]() |
![]() |
|
| 2FF7 | ⿷ | ⿷ | SURROUND FROM LEFT | 2 | ⿷匚 | ![]() |
||
| 2FF8 | ⿸ | ⿸ |
SURROUND FROM UPPER LEFT |
2 | ⿸广 | ![]() |
||
| 2FF9 | ⿹ | ⿹ |
SURROUND FROM UPPER RIGHT |
2 | ⿹ | ![]() |
||
| 2FFA | ⿺ | ⿺ |
SURROUND FROM LOWER LEFT |
2 | ⿺ | ![]() |
![]() |
|
| 2FFB | ⿻ | ⿻ | OVERLAID | 2 | ⿻从工 | OK![]() |
OK 巫 | |
|
41 |
Supplementary Characters (Surrogate pairs)
There are two terms that describe the concept of surrogates more accurately:
Surrogates or surrogate characters are misnomers because encoded characters cannot have a surrogate code point.
For more information, visit the Unicode Consortium's website, http://www.unicode.org.
Examples:
| Char | Vb String | Codepoint | Unicode Block | Font |
| 𠁎 | ChrW$(&HD840) & ChrW$(&HDC4E) | 2004E | CJK Unified Ideographs Extension B | DFSongStd |
In order to see Surrogate Pairs you must have a font which contains the characters you desire to render:
|
'An example of a surrogate character from CJK extension B is
According to Dr. International (http://www.microsoft.com/globaldev/drintl/columns/015/default.mspx#Q13): The SimSun18030 font (simsun18030.ttc) in the GB18030 support package does not have the complete GB18030 repertoire. It is missing the entire CJK Extension B set, as well as other characters. It does have, however, CJK Extension A. A font that contains Simplified Chinese glyphs from both CJK Extension A and B sets is "SimSun (Founder Extended)" (SurSong.ttf in the system), or 宋体–方正超大字符集 (in Chinese). It is currently available in the Simplified Chinese (CHS) version of Office XP, or the Microsoft Office Proofing Tools. Click the link for more information and how to buy. |
Here are several Functions for working with Surrogate Pairs:
Download modSurrogate:
| Function |
| Public Function CodepointToSurrogatePair(ByVal lChar As Long) As String |
| Public Function ShowHexW(ByVal s As String) As String |
| Public Function SurrogatePairToCodepoint(HiSurrogate As Integer, LoSurrogate As Integer) As Long |
| Public Function IsHiSurrogate(ch As Integer) As Boolean |
| Public Function IsLoSurrogate(ch As Integer) As Boolean |
| Public Function UnsignedToInteger(Value As Long) As Integer |
| Public Function IntegerToUnsigned(Value As Integer) As Long |
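The function bodies are in the download rather than on this page. As an illustration of the arithmetic involved, here is a sketch of the two conversion functions using the signatures listed above. The bodies are an assumption based on the UTF-16 definition, not the modSurrogate source. Note the trailing & on hex constants such as &HD800&, which keeps them from being read as negative Integers.

```vb
'Sketch: split a supplementary code point (U+10000..U+10FFFF) into
'a high/low surrogate pair. Returns an empty string for other values.
Public Function CodepointToSurrogatePair(ByVal lChar As Long) As String
    Dim lOffset As Long
    If lChar < &H10000 Or lChar > &H10FFFF Then Exit Function
    lOffset = lChar - &H10000
    CodepointToSurrogatePair = ChrW$(&HD800& + (lOffset \ &H400&)) & _
                               ChrW$(&HDC00& + (lOffset And &H3FF&))
End Function

'Sketch: combine a surrogate pair into its code point.
'Returns 0 if either value is outside its surrogate range.
Public Function SurrogatePairToCodepoint(HiSurrogate As Integer, LoSurrogate As Integer) As Long
    Dim lHi As Long
    Dim lLo As Long
    lHi = HiSurrogate And &HFFFF&   'signed Integer -> 0..65535
    lLo = LoSurrogate And &HFFFF&
    If lHi < &HD800& Or lHi > &HDBFF& Then Exit Function
    If lLo < &HDC00& Or lLo > &HDFFF& Then Exit Function
    SurrogatePairToCodepoint = &H10000 + (lHi - &HD800&) * &H400& + (lLo - &HDC00&)
End Function
```

For the example in the table above, code point 2004E gives offset 1004E (hex), so the high surrogate is D800 + 40 = D840 and the low surrogate is DC00 + 4E = DC4E.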
The Unicode Standard 4.0 (8 Aug 03)
3.8 Surrogates
D25 High-surrogate code point: A Unicode code point in the range U+D800 to U+DBFF.
D25a High-surrogate code unit: A 16-bit code unit in the range D800₁₆ to DBFF₁₆, used in UTF-16 as the leading code unit of a surrogate pair.
D26 Low-surrogate code point: A Unicode code point in the range U+DC00 to U+DFFF.
D26a Low-surrogate code unit: A 16-bit code unit in the range DC00₁₆ to DFFF₁₆, used in UTF-16 as the trailing code unit of a surrogate pair.
• High-surrogate and low-surrogate code points are designated only for that use.
• High-surrogate and low-surrogate code units are used only in the context of the UTF-16 character encoding form.
D27 Surrogate pair: A representation for a single abstract character that consists of a sequence of two 16-bit code units, where the first value of the pair is a high-surrogate code unit, and the second is a low-surrogate code unit.
• Surrogate pairs are used only in UTF-16. (See Section 3.9, Unicode Encoding Forms.)
• Isolated surrogate code units have no interpretation on their own. Certain other isolated code units in other encoding forms also have no interpretation on their own. For example, the isolated byte 80₁₆ has no interpretation in UTF-8; it can only be used as part of a multibyte sequence. (See Table 3-6.)
• Sometimes high-surrogate code units are referred to as leading surrogates. Low-surrogate code units are then referred to as trailing surrogates. This is analogous to usage in UTF-8, which has leading bytes and trailing bytes.
• For more information, see Section 15.5, Surrogates Area, and Section 5.4, Handling Surrogate Pairs in UTF-16.
|
42 |
There is a common misconception that VbDate is a string. This is probably due to the fact that when you hover over a VbDate in the IDE you will see a formatted date(short date format). It is actually a double in disguise where the integer portion represents the date and the fractional(decimal) portion is the time of day. If a VbDate is applied to a label caption or textbox it will display the date as short date as defined in your regional configurations.
You can coerce a VbDate into a double by using code such as "CDbl(Now)".
With all the Vb hacks needed to make things work in Unicode it is a pleasant surprise that Format$ with a date argument and FormatDateTime$ are Unicode aware. The problem lies in displaying these string values so you must use a Unicode aware control. If you don't have a Unicode control handy and you are using NT or later then you can simply use a Vb PictureBox and DrawTextW to display the results.
Sample code:
| m_Text= "Hello World" picCanvas.Refresh Private Sub picCanvas_Paint() |
MSDN: Visual Basic 6.0
Date variables are stored as IEEE 64-bit (8-byte) floating-point numbers that represent dates ranging from 1 January 100 to 31 December 9999 and times from 0:00:00 to 23:59:59. Any recognizable literal date values can be assigned to Date variables. Date literals must be enclosed within number signs (#), for example, #January 1, 1993# or #1 Jan 93#. Date variables display dates according to the short date format recognized by your computer. Times display according to the time format (either 12-hour or 24-hour) recognized by your computer.
When other numeric types are converted to Date, values to the left of the decimal represent date information while values to the right of the decimal represent time. Midnight is 0 and midday is 0.5. Negative whole numbers represent dates before 30 December 1899.
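To tie the two points together, here is a small sketch that shows the Double behind a Date and then draws a formatted date on a PictureBox with DrawTextW. It assumes a form with a PictureBox named picCanvas (AutoRedraw = False) on NT or later; the helper name DateAsDouble and the "Long Date" format are choices made for this example only.

```vb
Private Type RECT
    Left As Long
    Top As Long
    Right As Long
    Bottom As Long
End Type

Private Declare Function DrawTextW Lib "user32" (ByVal hdc As Long, ByVal lpStr As Long, _
    ByVal nCount As Long, lpRect As RECT, ByVal wFormat As Long) As Long

Private Const DT_CENTER As Long = &H1
Private Const DT_VCENTER As Long = &H4
Private Const DT_SINGLELINE As Long = &H20

Private m_Text As String

'Hypothetical helper: the integer part is the day count from 30 December 1899,
'the fraction is the time of day (0.5 = midday).
Public Function DateAsDouble(ByVal d As Date) As Double
    DateAsDouble = CDbl(d)
End Function

Private Sub Form_Load()
    m_Text = Format$(Now, "Long Date")   'localized month and day names
    picCanvas.Refresh
End Sub

Private Sub picCanvas_Paint()
    Dim rct As RECT
    'DrawTextW works in pixels, so convert from the current ScaleMode.
    rct.Right = picCanvas.ScaleX(picCanvas.ScaleWidth, picCanvas.ScaleMode, vbPixels)
    rct.Bottom = picCanvas.ScaleY(picCanvas.ScaleHeight, picCanvas.ScaleMode, vbPixels)
    If LenB(m_Text) Then
        DrawTextW picCanvas.hdc, StrPtr(m_Text), -1, rct, _
                  DT_CENTER Or DT_VCENTER Or DT_SINGLELINE
    End If
End Sub
```

For example, DateAsDouble(#1/1/2000 12:00:00 PM#) is 36526.5: day 36526 counted from 30 December 1899, plus half a day.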
|
43 |
Here are several methods of Reading and Writing Unicode Files:
Download modUnicodeRW:
| Method | Operation | Function |
| VB | Read | Public Function UnicodeFile_Read_VB(ByVal sFileName As String, Optional ByVal bRemoveBOM As Boolean) As String |
| Write | Public Sub UnicodeFile_Write_VB(ByVal sFileName As String, ByVal sText As String, Optional ByVal bInsertBOM As Boolean) | |
| FSO(Scripting) | Read | Public Function UnicodeFile_Read_FSO(ByVal sFileName As String, Optional ByVal TriState As TristateEnum = TristateTrue) As String |
| Write | Public Sub UnicodeFile_Write_FSO( ByVal sFileName As String, ByVal sText As String, Optional ByVal ForWrite As ForWriteEnum = ForWriting, Optional ByVal TriState As TristateEnum = TristateTrue) | |
| API | Read | Public Function UnicodeFile_Read_API(ByVal sFileName As String) As String |
| Write | Public Function UnicodeFile_Write_API(ByVal sFileName As String, ByVal sText As String) As Boolean |
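The implementations are in the download. For orientation, here is a sketch of how the pure VB pair could work, using the signatures from the table. The bodies are an assumption, not the modUnicodeRW source: the text goes through a Byte array so that Put/Get never see a String, and the optional BOM is the UTF-16LE mark FF FE.

```vb
'Sketch: write sText as UTF-16LE, optionally preceded by the FF FE byte order mark.
Public Sub UnicodeFile_Write_VB(ByVal sFileName As String, ByVal sText As String, _
        Optional ByVal bInsertBOM As Boolean)
    Dim f As Integer
    Dim b() As Byte
    'Binary mode does not truncate, so remove any old file first.
    '(Dir$ and Kill are themselves limited to ANSI file names.)
    If Len(Dir$(sFileName)) > 0 Then Kill sFileName
    f = FreeFile
    Open sFileName For Binary Access Write As #f
    If bInsertBOM Then
        ReDim b(0 To 1)
        b(0) = &HFF
        b(1) = &HFE
        Put #f, , b
    End If
    If LenB(sText) Then
        b = sText       'raw UTF-16 bytes, no ANSI conversion
        Put #f, , b
    End If
    Close #f
End Sub

'Sketch: read a UTF-16LE file back into a string, optionally dropping the BOM.
Public Function UnicodeFile_Read_VB(ByVal sFileName As String, _
        Optional ByVal bRemoveBOM As Boolean) As String
    Dim f As Integer
    Dim b() As Byte
    Dim s As String
    f = FreeFile
    Open sFileName For Binary Access Read As #f
    If LOF(f) > 0 Then
        ReDim b(0 To LOF(f) - 1)
        Get #f, , b
        s = b
    End If
    Close #f
    If bRemoveBOM And Len(s) > 0 Then
        'U+FEFF reads as -257 through AscW, which is what &HFEFF evaluates to.
        If AscW(s) = &HFEFF Then s = Mid$(s, 2)
    End If
    UnicodeFile_Read_VB = s
End Function
```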
Sometimes your file is saved in the old DOS 8.3 convention, aka short name. This is frustrating, and you certainly wouldn't want to display a short filename to a user. The solution is to use the API, which can convert to and from short/long pathnames.
Download modPathName:
| Method | Function |
| API | Public Function GetShortName(ByVal sLongFileName As String) As String |
| Public Function GetLongName(ByVal sShortFileName As String) As String |
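A sketch of the two functions, assuming they wrap the kernel32 calls GetShortPathNameW and GetLongPathNameW (the modPathName source may differ). The wide versions are used here, so this form is for NT-based systems; GetLongPathName requires Windows 2000 or later on the NT line. The private helper PathConvert is hypothetical.

```vb
Private Declare Function GetShortPathNameW Lib "kernel32" (ByVal lpszLongPath As Long, _
    ByVal lpszShortPath As Long, ByVal cchBuffer As Long) As Long
Private Declare Function GetLongPathNameW Lib "kernel32" (ByVal lpszShortPath As Long, _
    ByVal lpszLongPath As Long, ByVal cchBuffer As Long) As Long

'Hypothetical helper shared by both functions.
Private Function PathConvert(ByVal sPath As String, ByVal bToLong As Boolean) As String
    Dim sBuf As String
    Dim cch As Long
    If LenB(sPath) = 0 Then Exit Function
    'First call with no buffer returns the required size, including the null.
    If bToLong Then
        cch = GetLongPathNameW(StrPtr(sPath), 0, 0)
    Else
        cch = GetShortPathNameW(StrPtr(sPath), 0, 0)
    End If
    If cch = 0 Then Exit Function       'path not found or API failed
    sBuf = String$(cch, vbNullChar)
    'Second call returns the number of characters copied, excluding the null.
    If bToLong Then
        cch = GetLongPathNameW(StrPtr(sPath), StrPtr(sBuf), cch)
    Else
        cch = GetShortPathNameW(StrPtr(sPath), StrPtr(sBuf), cch)
    End If
    If cch > 0 And cch < Len(sBuf) Then PathConvert = Left$(sBuf, cch)
End Function

Public Function GetShortName(ByVal sLongFileName As String) As String
    GetShortName = PathConvert(sLongFileName, False)
End Function

Public Function GetLongName(ByVal sShortFileName As String) As String
    GetLongName = PathConvert(sShortFileName, True)
End Function
```

Both API calls fail for a path that does not exist, in which case these sketches return an empty string.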
|
44 |
Due to several requests on various Forums here are some basic instructions on how to enable Unicode in VbAccelerator ListView/TreeView controls. Note that this is just the basics to enable Unicode in LV cells and TV items. It does not include the extra support for LV Header, Column Sorting, or RightToLeft.
|
ListView Set Conditional compile to "UNICODE = -1" |
||
| Line | Old | New |
| 529 | .pszText = sText | .pszText = StrPtr(sText) |
| 547 | .sText = sText | .sText = StrPtr(sText) |
| 1379 | m_tLV.pszText = sBuf | m_tLV.pszText = StrPtr(sBuf) |
| 1422 | m_tLV.pszText = sBuf | m_tLV.pszText = StrPtr(sBuf) |
| 1425 | LSet m_tLV.pszText = sText & Chr$(0) | m_tLV.pszText = StrPtr(sText) |
| 1785 | tLV.pszText = sText & vbNullChar | tLV.pszText = StrPtr(sText) |
| 1937 | tLBI.pszImage = sURL & Chr$(0) | tLBI.pszImage = StrPtr(sURL) |
|
TreeView |
|
| CHANGE | DrawTextA to DrawTextW. Use StrPtr() in DrawTextW calls. |
| SUBSTITUTE | TVM_INSERTITEM with TVM_INSERTITEMW |
| ADD | Private Const TVM_INSERTITEMW = (TV_FIRST + 50) |
| ADD | Private Type TVITEMEXW mask As Long hItem As Long State As Long stateMask As Long pszText As Long cchTextMax As Long iImage As Long iSelectedImage As Long cChildren As Long lParam As Long iIntegral As Long End Type |
| CHANGE | Private Type TVINSERTSTRUCT hParent As Long hInsertAfter As Long Item As TVITEMEXW End Type |
| CHANGE | TVIN.Item.pszText = StrPtr(sText) |
| CHANGE | in pInitialize use StrPtr() for strings. |
|
45 |
You can save registry files either in text or Unicode format, but versions of Windows earlier than Win2K can't recognize Unicode files, so for those OS versions, you need to save the files in text format. In XP and Win2K, the registry editor is version 5, which you'll find specified in the first line of the exported registry file, which reads "Windows Registry Editor Version 5.00." Changing this header line (you can use a text editor such as notepad.exe to do so) to read REGEDIT4 will make the file compatible with earlier Windows OSs.
Since registry entries(Keys, Values) under Windows Registry 5.0 format can be Unicode you need to use the Wide versions of AdvApi32.DLL to manipulate keys and values. If you want to make a RegEdit clone you will also need Unicode aware versions of TreeView/ListView to render Unicode keys and values.
This snippet shows how one would implement registry functions that can run on all Platforms. We create a "Public Function RegOpenKeyEx" and then call the ANSI or Wide API versions as appropriate.
Private Declare Function RegOpenKeyExA Lib "advapi32" ( _
    ByVal hkey As Long, ByVal lpSubKey As String, _
    ByVal ulOptions As Long, ByVal samDesired As Long, _
    phkResult As Long) As Long
Private Declare Function RegOpenKeyExW Lib "advapi32" ( _
    ByVal hkey As Long, ByVal lpSubKey As Long, _
    ByVal ulOptions As Long, ByVal samDesired As Long, _
    phkResult As Long) As Long

'Purpose: ANSI/Unicode wrapper.
Public Function RegOpenKeyEx(ByVal hkey As Long, ByVal lpSubKey As String, _
        ByVal ulOptions As Long, ByVal samDesired As Long, _
        phkResult As Long) As Long
    If Is2000 Then
        RegOpenKeyEx = RegOpenKeyExW(hkey, StrPtr(lpSubKey), ulOptions, samDesired, phkResult)
    Else
        RegOpenKeyEx = RegOpenKeyExA(hkey, lpSubKey, ulOptions, samDesired, phkResult)
    End If
End Function
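The snippet relies on an Is2000 helper that is not shown. One way it might be written, assuming it means "running on an NT-based system, version 5 (Windows 2000) or later"; that meaning and the body below are assumptions:

```vb
Private Type OSVERSIONINFO
    dwOSVersionInfoSize As Long
    dwMajorVersion As Long
    dwMinorVersion As Long
    dwBuildNumber As Long
    dwPlatformId As Long
    szCSDVersion As String * 128
End Type

Private Declare Function GetVersionExA Lib "kernel32" (lpVersionInformation As OSVERSIONINFO) As Long

Private Const VER_PLATFORM_WIN32_NT As Long = 2

'Sketch: True on Windows 2000, XP and later NT-based systems.
Public Function Is2000() As Boolean
    Dim osv As OSVERSIONINFO
    osv.dwOSVersionInfoSize = Len(osv)   '148 bytes for the ANSI structure
    If GetVersionExA(osv) <> 0 Then
        Is2000 = (osv.dwPlatformId = VER_PLATFORM_WIN32_NT) And (osv.dwMajorVersion >= 5)
    End If
End Function
```

In a real project you would probably cache the result in a module-level variable rather than call GetVersionExA on every registry access.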
If you plan on writing your own Registry class here is a wish list of features available in RegEdit and some commercial Registry Editors:
Import/Export → XML.
|
46 |
"According to Microsoft" Windows 95 and 98 were built upon 3.1, and they don't support Unicode. All applications written for Windows 98 should be ANSI applications.
Actually Windows 98 has a handful of Unicode APIs, which are listed in the table below. Apart from these functions, the system DLLs for Windows 98 export symbols for the other wide character functions as well, but they all return FALSE, and GetLastError returns ERROR_CALL_NOT_IMPLEMENTED, which is defined as 120.
|
Unicode functions implemented on Windows 98 |
|
| EnumResourceLanguagesW | GetTextExtentPoint32W |
| EnumResourceNamesW | GetTextExtentPointW |
| EnumResourceTypesW | lstrlenW |
| ExtTextOutW | MessageBoxExW |
| FindResourceW | MessageBoxW |
| FindResourceExW | TextOutW |
| GetCharWidthW | WideCharToMultiByte |
| GetCommandLineW | MultiByteToWideChar |
What is really missing in Win98 and necessary for serious work is the DrawTextW API:
Private Declare Function DrawTextW Lib "user32" (ByVal hdc As Long, ByVal lpStr As Long, ByVal nCount As Long, lpRect As RECT, ByVal wFormat As Long) As Long
To get this in Win98 you will need to write a Uniscribe Wrapper that will emulate this function. This is not an easy task since there are 24 DrawText Alignment Flags that can be combined to specify how the text is to be formatted within the bounding rectangle.
Example of how to integrate the wrapper:
Public Sub DrawTextU(ByVal sText As String, ByRef rct As RECT, ByVal lFlags As Long)
    Dim lPtr As Long
    If (LenB(sText) = 0) Then Exit Sub
    If (IsNT) Then
        lPtr = StrPtr(sText)
        If Not (lPtr = 0) Then
            DrawTextW hdc, lPtr, -1, rct, lFlags
        End If
    Else 'Win9x
        If IsUnicode(sText) Then
            'Call Uniscribe Wrapper here
        Else
            DrawTextA hdc, sText, -1, rct, lFlags
        End If
    End If
End Sub
|
47 |

This class and demo show exactly how strings are stored in Visual Basic. They are stored as integers(0-65535). Since Vb does not have an unsigned integer we convert the VB signed Integer to a positive Long in the Demo. You can find the original submission by Chris Lucas at Planet Source Code http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=34787&lngWId=1 The original code has been modified to use a class. You can download the complete source code at http://www.unisuite.com/download/MapString.zip.
|
48 |
A Byte Array is an array of bytes that starts at LBound(usually 0 unless you change Option Base) and ends at UBound. Here are some examples:
| String | Conversion |
ByteArray |
Comments |
| Hello | b = StrConv(s, vbFromUnicode) | H 72 e 101 l 108 l 108 o 111 |
ANSI byte array of an ANSI string. |
| Hello | b = s | H 72 0 e 101 0 l 108 0 l 108 0 o 111 0 |
Unicode byte array of an ANSI string. |
| CHS: 欢迎 | b = s | C 67 0 H 72 0 S 83 0 : 58 0 32 0 欢 34 107 迎 206 143 |
Unicode byte array of a Unicode string. |
| CHS: 欢迎 | b = StrConv(s, vbFromUnicode) | C 67 H 72 S 83 : 58 32 ? 63 ? 63 |
ANSI byte array of a Unicode string. Note that the 2 Unicode characters are now replaced by "?". |
An important note is that a Vb string is stored with a terminating null character (two zero bytes, since each character is 16 bits) which is not counted in its length. A byte array ends with the last character of the string and has no null terminator.
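The conversions in the table can be checked with a few lines of code. The HexDump helper below is hypothetical and exists only to make the byte layout visible; it expects an initialized array.

```vb
'Hypothetical helper: render a byte array as space-separated hex pairs.
Public Function HexDump(b() As Byte) As String
    Dim i As Long
    Dim s As String
    For i = LBound(b) To UBound(b)
        s = s & Right$("0" & Hex$(b(i)), 2) & " "
    Next
    HexDump = RTrim$(s)
End Function

Public Sub ByteArrayDemo()
    Dim s As String
    Dim bUni() As Byte
    Dim bAnsi() As Byte

    s = "Hello"
    bUni = s                            'two bytes per character, low byte first
    bAnsi = StrConv(s, vbFromUnicode)   'one byte per character

    Debug.Print HexDump(bUni)           '48 00 65 00 6C 00 6C 00 6F 00
    Debug.Print HexDump(bAnsi)          '48 65 6C 6C 6F

    s = bUni                            'a byte array converts straight back to a string
    Debug.Print s                       'Hello
End Sub
```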
Byte Arrays can also be used to load raw data such as WAV, GIF, or JPG from resources. The Byte Array is then written to a temp file and loaded back using LoadPicture(). In the case of WAV you can use the following Declare to pass a byte array:
Private Declare Function PlaySoundData Lib "winmm.dll" Alias "PlaySoundA" (lpData As Any, ByVal hModule As Long, ByVal dwFlags As Long) As Long
A better way to eliminate the temp file is to use IStream which is addressed in the next topic.
|
49 |
If you try to use Get/Put to write Unicode strings they are changed to ANSI via the Put method.
Put appears to use a proprietary format to
write UDT data which is then readable by Get.
If you would like to study more about how this works in Vb6 see the link:
http://sandsprite.com/CodeStuff/Understanding_UDTs.html
It is also interesting that VB's Put and Get command were built to be pretty smart. We know that complex UDTs(such as "s As String" or "b() As Byte") aren't stored in memory as a continuous block, however the VB Put command is kind enough to pack the whole structure and data into a new format for us so that it is complete when dumping it to disk.
Example:
| Public Type MyType s As String i As Integer L as Long End Type Public Function UnicodeFile_Write_Vb_UDT(ByVal sFileName As
String, _ |
The solution for Unicode is simple. Just change the UDT to:
Public Type MyType
    b() As Byte
    i As Integer
    L As Long
End Type
Example (Write): MyArray(0).b = "CHS: " & ChrW$(&H6B22) & ChrW$(&H8FCE)
Example (Read): MyString = MyArray(0).b
Now that strings are stored as a byte array they will not be corrupted and you can use Get/Put for Unicode strings in UDTs.
|
50 |
Many people shy away from IStream because they think it is too complex to implement or requires a Type Library. Neither is true, assuming of course that you do not require the full repertoire of IStream Read/Seek, etc.
Where would you use IStream? A good example would be reading a Byte
Array from a CUSTOM resource such as Jpg or Gif. Normally you would write the
Byte Array to a temp file and then read it back in via LoadPicture().
This is very inefficient for the following reasons:
Another place to use IStream is in conjunction with BLOBs from a Database. They could be strings or pictures. In any case you read the data into a Byte Array, convert to IStream, and finally pass it to a function that accepts Istream.
In the example here we are loading data from a Byte Array. This could be, for example, LoadResData from a CUSTOM resource such as Jpg or Gif. Almost all examples on the Web use a TLB, but here we will define IStream as IUnknown. Also note that the Declares use "As Any". This sample code is used where IStream is required for input, as in GDI+ CreateBitmapFromIStream. You also need to tweak the GDI+ Declares that use IStream to type IUnknown.
|
Sample Code Public Declare Function
GdipLoadImageFromStream Lib "GDIPlus" (ByVal stream As
IUnknown, Image As
Long) As GpStatus |
OK. This works where you require a stream but what about StdPicture objects. Not that much
more complicated in that
you need to prefix a picture header to the byte array and call
OleCreatePictureIndirect. CopyMemory API is used to put this together.
|
Sample Code NOTE: This uses an IStream TLB.
Public Function
Array2Picture(aBytes() As Byte) As StdPicture Public Sub
Picture2Array(ByVal oObj As StdPicture, aBytes() As Byte) |
|
51 |
These are the basics of how to make a Unicode aware calendar control. You can use the ComCtl32.Dll derived DTPicker or MonthCal but this will only display a calendar for the current LCID set in Control Panel/ Regional Settings.
To make a calendar that will work for any LCID:
Use API EnumSystemLocales with Flag LCID_INSTALLED to build a collection of installed Locales for the user's platform. These can then be added to a ListBox.
Add a Unicode aware ComboBox for MonthName selection and an UpDown Control for Year selection.
Select an LCID for the calendar.
Populate MonthName Combo using API GetLocaleInfoW.
Owner draw the calendar using the current selected Year/Month.
Note that Vb supports both Gregorian and Hijri Calendars via VBA.Calendar = vbCalGreg or VBA.Calendar = vbCalHijri
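The month name step can be sketched as follows. GetLocaleInfoW and the LOCALE_SMONTHNAME1 constant are the standard kernel32 API; the wrapper name MonthNameW is hypothetical, and the wide call limits this form to NT-based systems.

```vb
Private Declare Function GetLocaleInfoW Lib "kernel32" (ByVal Locale As Long, _
    ByVal LCType As Long, ByVal lpLCData As Long, ByVal cchData As Long) As Long

'LOCALE_SMONTHNAME1 through LOCALE_SMONTHNAME12 are consecutive values.
Private Const LOCALE_SMONTHNAME1 As Long = &H38

'Sketch: full month name (1 = first month) for the given LCID as a Unicode string.
Public Function MonthNameW(ByVal lLCID As Long, ByVal lMonth As Long) As String
    Dim sBuf As String
    Dim cch As Long
    If lMonth < 1 Or lMonth > 12 Then Exit Function
    'First call returns the required size in characters, including the null.
    cch = GetLocaleInfoW(lLCID, LOCALE_SMONTHNAME1 + lMonth - 1, 0, 0)
    If cch > 1 Then
        sBuf = String$(cch, vbNullChar)
        cch = GetLocaleInfoW(lLCID, LOCALE_SMONTHNAME1 + lMonth - 1, StrPtr(sBuf), cch)
        If cch > 1 Then MonthNameW = Left$(sBuf, cch - 1)
    End If
End Function
```

Calling MonthNameW(1041, lMonth) for lMonth = 1 To 12 would supply the Japanese captions for the MonthName combo, provided that locale is installed; the strings must then be added through a Unicode-aware path, not the ANSI AddItem method.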

|
52 |
http://www.unicode.org/ describes the "Unicode Collation Algorithm" in UTS #10 (Unicode Collation Algorithm), and http://www.answers.com/topic/unicode-collation-algorithm provides further information. However, Microsoft does not use the "Unicode Collation Algorithm", as indicated at http://blogs.msdn.com/michkap/archive/2004/11/28/271121.aspx.
In Visual Basic many sort algorithms are implemented using comparisons like "If String1 < String2 Then". What this does depends on the Option Compare setting of the module: with the default, Option Compare Binary, the strings are compared by their character code values, which is rarely the order a user expects.
| StrComp Mode | Usage |
| StrComp(S1,S2,vbBinaryCompare) | Compares string S1 and S2 based on their Unicode values. |
| StrComp(S1,S2,vbTextCompare) | Compares string S1 and S2 in a locale-dependent, case-insensitive way. |
As it turns out, Japanese for example will be sorted correctly (http://blogs.msdn.com/michkap/archive/2004/12/27/332618.aspx) using CompareStringW even if you do not pass the Japanese LCID, 1041. "The world is in the proper アアあイイいウウうエエえオオお order (in the traditional AIUEO order, Halfwidth Katakana followed by Fullwidth Katakana followed by Hiragana). "
'Valid dwCmpFlags
Const NORM_IGNORECASE As Long = &H1 'Ignore case.
Const NORM_IGNOREKANATYPE As Long = &H10000 'Do not differentiate between Hiragana and Katakana characters. Corresponding Hiragana and Katakana characters compare as equal.
Const NORM_IGNORENONSPACE As Long = &H2 'Ignore nonspacing characters.
Const NORM_IGNORESYMBOLS As Long = &H4 'Ignore symbols.
Const NORM_IGNOREWIDTH As Long = &H20000 'Do not differentiate between a single-byte character and the same character as a double-byte character.
Const SORT_STRINGSORT As Long = &H1000 'Treat punctuation the same as symbols.
'Default Locale
Const LOCALE_SYSTEM_DEFAULT As Long = &H800
Const LOCALE_USER_DEFAULT As Long = &H400
| Declare Function CompareStringA Lib "kernel32.dll" (ByVal Locale As Long, ByVal dwCmpFlags As Long, ByVal lpString1 As String, ByVal cchCount1 As Long, ByVal lpString2 As String, ByVal cchCount2 As Long) As Long |
Declare Function CompareStringW Lib "kernel32.dll" (ByVal Locale As Long, ByVal dwCmpFlags As Long, ByVal lpString1 As Long, ByVal cchCount1 As Long, ByVal lpString2 As Long, ByVal cchCount2 As Long) As Long |
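A minimal sketch of how the wide Declare might be wrapped for use in a sort routine. The wrapper name CompareW is hypothetical. CompareStringW returns 1, 2 or 3 for less than, equal and greater than (0 on failure), so subtracting 2 gives a StrComp-style result.

```vb
Option Explicit

Private Declare Function CompareStringW Lib "kernel32.dll" (ByVal Locale As Long, ByVal dwCmpFlags As Long, ByVal lpString1 As Long, ByVal cchCount1 As Long, ByVal lpString2 As Long, ByVal cchCount2 As Long) As Long

Private Const LOCALE_USER_DEFAULT As Long = &H400

' Hypothetical wrapper: returns -1, 0 or 1 like StrComp.
' Falls back to a binary StrComp if the API call fails (returns 0).
Public Function CompareW(ByVal s1 As String, ByVal s2 As String, _
                         Optional ByVal lFlags As Long = 0, _
                         Optional ByVal LCID As Long = &H400) As Long
    Dim lRet As Long

    ' StrPtr passes the Unicode buffer directly, so no ANSI conversion occurs.
    lRet = CompareStringW(LCID, lFlags, StrPtr(s1), Len(s1), StrPtr(s2), Len(s2))
    If lRet = 0 Then
        CompareW = StrComp(s1, s2, vbBinaryCompare)
    Else
        CompareW = lRet - 2
    End If
End Function
```

In an existing sort, "If String1 < String2 Then" would become "If CompareW(String1, String2) < 0 Then".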
|
53 |
Several subclassers are available ranging from the classic SSubTmr6 by VbAccelerator.com to several available on Planet Source Code such as SelfSub, SelfHook, SelfCallback by Paul Caton.
For the most part they work fine except when the hWnd you are subclassing is a Unicode window. You can detect this automatically in your subclassing code using API IsWindowUnicode(hWnd). If the result is True then you must use the Wide API calls or else your window will be changed to ANSI.
It may not be necessary to wrap all of the API calls (GetPropW, SetPropW, RemovePropW, GetWindowLongW, SetWindowLongW, CallWindowProcW), however the SetWindowLongW is mandatory since your window will revert to ANSI if you use SetWindowLongA.
Private Declare Function IsWindowUnicode Lib "user32" (ByVal hwnd As Long) As Long
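A sketch of how a subclasser might pick the matching API for each window. The helper names SetWndProc and CallOldWndProc are hypothetical; the Declares are the standard user32 signatures with all parameters As Long.

```vb
Option Explicit

Private Declare Function IsWindowUnicode Lib "user32" (ByVal hWnd As Long) As Long
Private Declare Function SetWindowLongA Lib "user32" (ByVal hWnd As Long, ByVal nIndex As Long, ByVal dwNewLong As Long) As Long
Private Declare Function SetWindowLongW Lib "user32" (ByVal hWnd As Long, ByVal nIndex As Long, ByVal dwNewLong As Long) As Long
Private Declare Function CallWindowProcA Lib "user32" (ByVal lpPrevWndFunc As Long, ByVal hWnd As Long, ByVal uMsg As Long, ByVal wParam As Long, ByVal lParam As Long) As Long
Private Declare Function CallWindowProcW Lib "user32" (ByVal lpPrevWndFunc As Long, ByVal hWnd As Long, ByVal uMsg As Long, ByVal wParam As Long, ByVal lParam As Long) As Long

Private Const GWL_WNDPROC As Long = -4

' Hypothetical helper: installs lpfnNew as the window procedure and returns
' the previous one, using the W call for Unicode windows so that the
' window is not switched to ANSI.
Public Function SetWndProc(ByVal hWnd As Long, ByVal lpfnNew As Long) As Long
    If IsWindowUnicode(hWnd) <> 0 Then
        SetWndProc = SetWindowLongW(hWnd, GWL_WNDPROC, lpfnNew)
    Else
        SetWndProc = SetWindowLongA(hWnd, GWL_WNDPROC, lpfnNew)
    End If
End Function

' Hypothetical helper: forwards a message to the previous window procedure
' with the call that matches the window type.
Public Function CallOldWndProc(ByVal lpfnOld As Long, ByVal hWnd As Long, ByVal uMsg As Long, ByVal wParam As Long, ByVal lParam As Long) As Long
    If IsWindowUnicode(hWnd) <> 0 Then
        CallOldWndProc = CallWindowProcW(lpfnOld, hWnd, uMsg, wParam, lParam)
    Else
        CallOldWndProc = CallWindowProcA(lpfnOld, hWnd, uMsg, wParam, lParam)
    End If
End Function
```

Call SetWndProc hWnd, AddressOf YourWndProc to subclass, and call it again with the returned value to unsubclass.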
|
54 |
Keith LaVolpe was kind enough to provide this code to
Get Unicode Filenames via DragDrop or Paste(Clipboard).
Here is the code. You can also download the zipped project file
here:
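For orientation only, here is a much smaller sketch of the Paste (Clipboard) half. It is not the code referred to above. It assumes the file list is on the clipboard in CF_HDROP format and reads it with DragQueryFileW; the function name ClipboardFilesW is hypothetical.

```vb
Option Explicit

Private Declare Function OpenClipboard Lib "user32" (ByVal hWndNewOwner As Long) As Long
Private Declare Function CloseClipboard Lib "user32" () As Long
Private Declare Function GetClipboardData Lib "user32" (ByVal wFormat As Long) As Long
Private Declare Function DragQueryFileW Lib "shell32" (ByVal hDrop As Long, ByVal iFile As Long, ByVal lpszFile As Long, ByVal cch As Long) As Long

Private Const CF_HDROP As Long = 15

' Hypothetical helper: returns a Collection of Unicode file names copied
' to the clipboard from Explorer (empty if none are available).
Public Function ClipboardFilesW() As Collection
    Dim col As Collection
    Dim hDrop As Long, nFiles As Long, i As Long, cch As Long
    Dim sName As String

    Set col = New Collection
    If OpenClipboard(0) <> 0 Then
        hDrop = GetClipboardData(CF_HDROP)
        If hDrop <> 0 Then
            ' iFile = -1 (&HFFFFFFFF) returns the number of files.
            nFiles = DragQueryFileW(hDrop, -1, 0, 0)
            For i = 0 To nFiles - 1
                ' A null buffer returns the length without the terminator.
                cch = DragQueryFileW(hDrop, i, 0, 0)
                sName = String$(cch + 1, vbNullChar)
                cch = DragQueryFileW(hDrop, i, StrPtr(sName), cch + 1)
                col.Add Left$(sName, cch)
            Next i
        End If
        CloseClipboard
    End If
    Set ClipboardFilesW = col
End Function
```

The clipboard owns the hDrop handle returned by GetClipboardData, so it is not freed here.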
|
55 |
If you have already tried to print Unicode using the Vb Printer Object you already know that it is ANSI only.
Here are several options:
Write a Unicode aware Printer Object from scratch.
Use Vb Printer.hDc and Print from a Unicode aware RichEdit control using PrintRange.
Borrow the Vb Printer.hDc and write Unicode to the hDc using TextOutW or DrawTextW. You are responsible for the layout and positioning of the text and/or images.
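The third option might look roughly like this. The procedure name PrintUnicodeLine is hypothetical. The sketch assumes that a plain Printer.Print is enough to make Vb start the document so that Printer.hDC is valid, and that the font Vb has selected into the printer DC contains the required glyphs.

```vb
Option Explicit

Private Declare Function TextOutW Lib "gdi32" (ByVal hDC As Long, ByVal x As Long, ByVal y As Long, ByVal lpString As Long, ByVal nCount As Long) As Long

' Hypothetical helper: prints one line of Unicode text at a position
' given in twips, then ends the document.
Public Sub PrintUnicodeLine(ByVal sText As String, ByVal xTwips As Single, ByVal yTwips As Single)
    Dim x As Long, y As Long

    ' Assumption: printing an empty line makes Vb start the page,
    ' after which the hDC can be used for API drawing.
    Printer.Print ""

    ' TextOutW works in device units, so convert from twips to pixels.
    x = Printer.ScaleX(xTwips, vbTwips, vbPixels)
    y = Printer.ScaleY(yTwips, vbTwips, vbPixels)

    TextOutW Printer.hDC, x, y, StrPtr(sText), Len(sText)
    Printer.EndDoc
End Sub
```

TextOutW does no wrapping or page breaking, so line and page layout remain your responsibility, as noted above.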
|
56 |

Internationalization with Visual Basic by Michael S. Kaplan.
Copyright© 2000 by Sams Publishing.
ISBN: 0-672-31977-2
First Printing: September 2000
The companion CD contains ≈ 100 MB of Vb sample code including:
| Wrapper for using Mlang in Visual Basic. |
Wrapper for using Uniscribe in Visual Basic. Includes:
|
| Convert a Unicode (UCS-2) string to a multibyte string(ANSI - DBCS). |
| Converts a multibyte string to a Unicode (UCS-2) string. |
Keep in mind that this was written before Windows XP so some tricks that worked back when the book was written do not work correctly when XP Themes are activated.
Although it is currently out of print you can still find a new or used copy from several on-line bookstores.
|
57 |
| Chilkat Software | http://www.example-code.com/vb/unicode.asp | Unicode Visual Basic Examples |
| Unicows.DLL Download | http://www.microsoft.com/data. | Microsoft Layer for Unicode Technology (Unicows.DLL) |
| GDIplus.DLL Download | gdiplus_dnld.exe | |
| General Unicode | http://www.unicode.org | |
| * Unicode Conversions in Vb6 | http://www.vovisoft.com/unicode/UniFunctions.htm | Unicode Conversion |
| * Zip Download | http://www.vovisoft.com/unicode/UniTextInOut.zip | Vb6 Unicode Conversion source code |
| Owner/Custom Draw Unicode ActiveX Grid | http://www.unisuite.com | Screenshots, Docs, Features. |
| COMMCTRL.TLB | www.domaindlx.com/e_morcillo/ | comctl32.dll 5.8 Type Library |
| UTF | http://czyborra.com/utf/ | Unicode Transformation Formats |
| MSLU Newsgroup | microsoft.public.platformsdk.mslayerforunicode | Info about MS Layer for Unicode |
| MSLU: reported bugs and known issues | http://trigeminal.com/usenet/usenet035.asp | Maintained by an employee of Microsoft who is the principal developer for MSLU |
| Sample code and other info | http://www.trigeminal.com | "Internationalization with Visual Basic" by Michael S. Kaplan |
| DrInternational Columns | http://www.microsoft.com/globaldev/DrIntl/columns/default.mspx | |
| World-Ready Software Example | http://www.microsoft.com/globaldev/tools/wrapp.mspx | C++ |
* The UTF16 to UTF8 conversion works (although it is not optimized) and the UTF8 to UTF16 conversion is not 100% correct, so I suggest using the modified versions here.
|
58 |
The most common problem with Unicode controls is the infamous display of "??" or boxes "▯▯" . It can usually be traced to one or more of the following areas:
The control is not Unicode aware. This applies to all controls supplied with Vb6 with the exception of MSHFLXGD.OCX. The reason for this is that Vb controls were built using ANSI API Functions/Structures/Constants. In addition, many controls that claim to be Unicode aware may require that you are running the NT Platform or above (check the manufacturer's specs).
The Font you are using does not support the glyphs that you are attempting to display.
| Platform | Font Fallback Available | Comments |
| Win98/ME/NT | Use Arial Unicode MS | |
| Win2000/WinXP | Can use Arial or Tahoma * |
* Note: MS Sans Serif does not appear to support Font Fallback.
| DBCS character that represents a Japanese wide-width "A" | &H82 &H60 | |
| Unicode wide-width "A" (Full width Latin Capital Letter A) | &hFF21 | A |
|
59 |
This has come up several times in various Forums. Running Vb6 on
non-English machines may seem confusing to U.S. English users.
Is it Unicode? No, it is MBCS in a Vb6 ANSI code window.
On Chinese machine you see: CHS: 欢迎 in Vb6 ANSI code window.
On U.S. machine you see: CHS: »¶Ó in Vb6 ANSI code window. This looks like junk
but it is MBCS equivalent of CHS: 欢迎 using codepage 936.
On Greek machine you see Clipboard.SetText "αβγ" in Vb6 ANSI
code window.
On U.S. machine you see Clipboard.SetText "áâã" in Vb6 ANSI code window. This
looks like junk but it is MBCS equivalent of "αβγ" using codepage 1253.
| How is it that a Chinese or Greek machine can display and set strings in the Vb IDE window in its own language? | The Vb IDE is ANSI, but setting the Font.Charset to Chinese allows Chinese users to type in Chinese. This is not Unicode though; it is MBCS. If you download a Vb project from a Chinese site and run it on a U.S. machine you will see comments and strings as unreadable bytes. These bytes, though, are perfectly good MBCS. If you were to reboot your machine in Chinese (as the default language for non-Unicode programs) you would see readable Chinese. |
| What happens to data sent to the clipboard using the Vb Clipboard object on a Chinese or Greek machine? | It appears as though the MBCS strings in the program are sent to
the clipboard as follows:
|
Here are several links that explain this somewhat confusing issue:
| Title | Link |
| Display Unicode Strings in Visual Basic 6.0 | http://www.example-code.com/vb/vbUnicode1.asp |
| Display Japanese in VB6 on Any Computer Regardless of Locale | http://www.example-code.com/vb/vbUnicode1.asp |
| VB, VBLM & Unicode | http://www.whippleware.com/VBLM/webhelp/vblm/vb__vblm___unicode.htm |
|
60 |
The easiest way to get Far East support is to set Regional Config - Language for non-Unicode programs to your desired language.
Here is one way to get Far East support in the Vb6 IDE without rebooting into another language.
NJStar Communicator will
allow all ANSI programs (not just Vb6) to support Chinese and other Far East
languages.
The second Debug.Print "CHS: 欢迎" statement below was actually pasted from a
Unicode string in a Notepad file.
Also note the Caption property in the Properties window, shown as "NJSTAR CHS: 欢迎".
Priced at US$ 99, this might be useful for using a single Far East language in
conjunction with an English application.
Note that this is NOT Unicode since the Vb windows are ANSI.

|
61 |
You're running XP but your control looks like this when displaying Unicode:

Solution: you need to install the optional Far East and
Right-to-Left languages.
On XP go to Control Panel, Regional and Language Options, Languages tab.
Check both boxes under Supplemental language support.

You will be prompted to insert your XP installation CD to update your system to include this support.
|
62 |
APIs are usually defined something like this:
Private Declare Function DrawTextA Lib "user32.dll" (ByVal hdc As Long, ByVal lpStr As String, ByVal nCount As Long, ByRef lpRect As RECT, ByVal wFormat As Long) As Long
Private Declare Function DrawTextW Lib "user32.dll" (ByVal hdc As Long, ByVal lpStr As String, ByVal nCount As Long, ByRef lpRect As RECT, ByVal wFormat As Long) As Long
When calling the above DrawTextW, Vb converts the String argument to ANSI.
Knowing that Vb will convert the Unicode string when calling the
API, we first convert the already-Unicode Vb string to "DoubleUnicode" via
StrConv.
When the API is called, the "DoubleUnicode" is converted back to
"Unicode".
While this method works, it is inefficient since it involves two conversions, and it
should probably be considered an unnecessary hack.
sUni = "CHS: " & ChrW$(&H6B22) & ChrW$(&H8FCE)
DrawTextW Me.hdc, StrConv(sUni, vbUnicode), -1, rct, 0
If API DrawTextW were declared inside a Type Library (TLB) then we
could call it directly like this:
DrawTextW Me.hdc, sUni, -1, rct, 0
The best way to do this in Vb6 when the API is
declared in your application is to just change the Wide API String parameter to Long and
use StrPtr:
Private Declare Function DrawTextW Lib "user32.dll" (ByVal hdc As Long, ByVal lpStr As Long, ByVal nCount As Long, ByRef lpRect As RECT, ByVal wFormat As Long) As Long
DrawTextW Me.hdc, StrPtr(sUni), -1, rct, 0
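Put together as a complete Form module, the StrPtr approach might look like the following sketch. The RECT type and the Form_Paint handler are filled in here as assumptions so that the fragment can run on its own.

```vb
Option Explicit

Private Type RECT
    Left As Long
    Top As Long
    Right As Long
    Bottom As Long
End Type

Private Declare Function DrawTextW Lib "user32.dll" (ByVal hdc As Long, ByVal lpStr As Long, ByVal nCount As Long, ByRef lpRect As RECT, ByVal wFormat As Long) As Long

Private Sub Form_Paint()
    Dim sUni As String
    Dim rct As RECT

    sUni = "CHS: " & ChrW$(&H6B22) & ChrW$(&H8FCE)

    ' DrawTextW expects the rectangle in pixels, so convert the client area.
    rct.Right = Me.ScaleX(Me.ScaleWidth, Me.ScaleMode, vbPixels)
    rct.Bottom = Me.ScaleY(Me.ScaleHeight, Me.ScaleMode, vbPixels)

    ' StrPtr hands the API the Unicode buffer directly: no conversions.
    DrawTextW Me.hdc, StrPtr(sUni), Len(sUni), rct, 0
End Sub
```

The Chinese characters only appear if the form's font, or font fallback on the platform, can supply the glyphs.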
To make matters worse, a wide API with a String
parameter combined with StrConv will return garbage in lieu of correct Unicode on
a Japanese OS. You can reproduce this error by setting your U.S. English OS to
Japanese for non-Unicode programs via Regional Configurations and restarting
your computer.
Run this sample code to see the error with GetLocaleInfoW:
| This will work: Private Declare Function GetLocaleInfoW Lib "kernel32" (ByVal Locale As Long, ByVal LCType As Long, ByVal lpData As Long, ByVal cchData As Long) As Long |
| This will not: Private Declare Function GetLocaleInfoW Lib "kernel32" (ByVal Locale As Long, ByVal LCType As Long, ByVal lpData As String, ByVal cchData As Long) As Long |
|
63 |
Often we need a quick solution for displaying a Unicode string
without using a Unicode aware control.
Vb MsgBox or Debug.Print will not display Unicode but this ShellMsgBox will.
If for some reason you don't see Unicode using ShellMsgBox you
may be missing FarEast, complex script, or RTL support.
See Where's the Beef.
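As a simpler stand-in for quick tests (this is not the ShellMsgBox routine mentioned above), the user32 MessageBoxW API can be called directly with StrPtr. The wrapper name MsgBoxW is hypothetical. It relies on the Vb button and result constants having the same values as the API's MB_ and ID constants.

```vb
Option Explicit

Private Declare Function MessageBoxW Lib "user32" (ByVal hWnd As Long, ByVal lpText As Long, ByVal lpCaption As Long, ByVal uType As Long) As Long

' Hypothetical wrapper: a Unicode replacement for simple MsgBox calls.
Public Function MsgBoxW(ByVal sPrompt As String, _
                        Optional ByVal Buttons As VbMsgBoxStyle = vbOKOnly, _
                        Optional ByVal sTitle As String) As VbMsgBoxResult
    ' Default the caption to the application title, as MsgBox does.
    If LenB(sTitle) = 0 Then sTitle = App.Title

    ' hWnd = 0 means the message box has no owner window.
    MsgBoxW = MessageBoxW(0, StrPtr(sPrompt), StrPtr(sTitle), Buttons)
End Function
```

For example: MsgBoxW "CHS: " & ChrW$(&H6B22) & ChrW$(&H8FCE), vbInformation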
|
64 |
It's hard enough just to make controls with full Unicode support
on NT/2000/XP.
For Win98 you have the following options:
Convert Unicode to SBCS/DBCS and set the appropriate codepage. Unfortunately this doesn't allow you to handle multiple languages unless the Font supports all languages in the string.
Render Unicode using Uniscribe. You will need a wrapper for Uniscribe Usp10.Dll API Functions to do this.
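The first option can be sketched with WideCharToMultiByte. The function name UniToMBCS is hypothetical. The sketch returns the converted text as a Byte array and assumes the caller knows which codepage matches the language of the string (for example 932 for Japanese).

```vb
Option Explicit

Private Declare Function WideCharToMultiByte Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long

' Hypothetical helper: converts a Vb (UTF-16) string to SBCS/DBCS bytes
' in the given codepage. Returns an unallocated array on failure or for
' an empty string.
Public Function UniToMBCS(ByVal sUni As String, ByVal CodePage As Long) As Byte()
    Dim aOut() As Byte
    Dim cb As Long

    If Len(sUni) > 0 Then
        ' First call: ask how many bytes the converted text needs.
        cb = WideCharToMultiByte(CodePage, 0, StrPtr(sUni), Len(sUni), 0, 0, 0, 0)
        If cb > 0 Then
            ReDim aOut(0 To cb - 1)
            WideCharToMultiByte CodePage, 0, StrPtr(sUni), Len(sUni), VarPtr(aOut(0)), cb, 0, 0
        End If
    End If
    UniToMBCS = aOut
End Function
```

Characters that do not exist in the target codepage are replaced by the codepage's default character, which is the limitation described in the first option.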
Even if you have the system requirements for VB.NET you may not want to go through the learning curve to come up to speed with the newer language. In addition, applications developed with .NET will require the .NET Framework to be installed on the target system.
If you really love Vb6 and would like to help keep it alive then sign the petition at http://classicvb.org/petition/.
As of 28-Jan-2010 the petition, opened on 08-Mar-2005, has 14290 signatories including 265 Microsoft MVPs.