VB: Converting UTF-8 to GB2312

Updated: 2021-05-28 22:15:21 Author: admin2

The following is reposted from my Baidu Space, where I collected it. If the layout here looks poor, you can read the article directly in my space: http://hi.baidu.com/newkedison/blog/item/1c7d2c392cc192f63b87ce12.html

Some notes on UTF-8
Friday, June 13, 2008, 08:17

I. Most important: converting between UTF-8 and Unicode

UTF-8 is a widely used encoding that aims to bring the world's languages into a single unified encoding; several Asian languages are already covered. UTF stands for UCS Transformation Format. UTF-8 represents each character with a variable number of bytes, in theory up to 6 bytes. Current UTF-8 as defined in RFC 3629 is limited to 4 bytes, which is why the table further down stops at 0010 FFFF. UTF-8 is compatible with ASCII (0-127): an ASCII character is encoded in UTF-8 exactly as it is in ASCII.

Characters that need more than one byte follow this rule: the number of leading 1 bits in the first byte gives the number of bytes in the character's encoding. For example, a two-byte character has the form 110xxxxx 10xxxxxx; a three-byte character has the form 1110xxxx 10xxxxxx 10xxxxxx; and so on, up to the six-byte form 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx. The x positions are filled with the bits of the character code in binary. Only the shortest multi-byte sequence that can represent a character code is used.

Examples:

Unicode character 00 A9 (copyright sign) = 1010 1001; UTF-8 encoding: 11000010 10101001 = 0xC2 0xA9
Character 22 60 (not-equal sign) = 0010 0010 0110 0000; UTF-8 encoding: 11100010 10001001 10100000 = 0xE2 0x89 0xA0

These conversion examples have been checked and are correct. If they are unclear, work through them again carefully.

Correspondence between Unicode code points and UTF-8 encodings

The table below summarizes the format of these different octet types. The letter x indicates bits available for encoding bits of the character number.

Char. number range  |        UTF-8 octet sequence
   (hexadecimal)    |              (binary)
--------------------+---------------------------------------------
0000 0000-0000 007F | 0xxxxxxx
0000 0080-0000 07FF | 110xxxxx 10xxxxxx
0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx

This table maps Unicode code points to UTF-8 encodings. Common Chinese characters fall in the range 0000 0800-0000 FFFF, so each of them takes three bytes in UTF-8.

II. About the BOM

UTF-8 uses the byte as its code unit, so it has no byte-order problem. UTF-16 uses two bytes as its code unit, so before interpreting UTF-16 text you must first determine the byte order of each code unit. For example, the Unicode code of "奎" is 594E and that of "乙" is 4E59. If we receive the UTF-16 byte stream "594E", is it "奎" or "乙"?
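The ambiguity can be made concrete with a few lines of VB. This is only an illustrative sketch: the procedure name ByteOrderDemo is hypothetical and does not come from the original article. It combines the same two bytes, 59 and 4E, in both orders.

```vb
' Hypothetical demo, not part of the original article.
Sub ByteOrderDemo()
    Dim first As Long, second As Long
    first = &H59     ' first byte received
    second = &H4E    ' second byte received

    ' Big-endian: the first byte is the high byte, giving 594E
    Debug.Print Hex$(first * &H100 + second)

    ' Little-endian: the first byte is the low byte, giving 4E59
    Debug.Print Hex$(second * &H100 + first)
End Sub
```

Passing either value to ChrW$ yields the corresponding character, so the same two bytes produce "奎" under one byte order and "乙" under the other.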
The method the Unicode standard recommends for marking byte order is the BOM. Here BOM does not mean "Bill Of Material" but Byte Order Mark. The BOM is a rather clever idea: UCS has a character called "ZERO WIDTH NO-BREAK SPACE" whose code is FEFF, while FFFE is not a character in UCS and so should never appear in real data. The UCS specification recommends transmitting the character "ZERO WIDTH NO-BREAK SPACE" before the byte stream itself. If the receiver gets FEFF, the stream is big-endian; if it gets FFFE, the stream is little-endian. For this reason the character "ZERO WIDTH NO-BREAK SPACE" is also called the BOM.

UTF-8 does not need a BOM to indicate byte order, but a BOM can be used to indicate the encoding. The UTF-8 encoding of "ZERO WIDTH NO-BREAK SPACE" is EF BB BF (readers can verify this with the encoding method described earlier). So a receiver that sees a byte stream beginning with EF BB BF knows that it is UTF-8.

III. VB functions for converting UTF-8 to Unicode

1. Without the API

    Function Utf8ToUnicode(ByRef Utf() As Byte) As String
        Dim utfLen As Long
        ' UBound raises an error on an uninitialized array; -1 then means "empty"
        utfLen = -1
        On Error Resume Next
        utfLen = UBound(Utf)
        If utfLen = -1 Then Exit Function
        On Error GoTo 0
        Dim i As Long, j As Long, k As Long, N As Long
        Dim B As Byte, cnt As Byte
        Dim Buf() As String
        ReDim Buf(utfLen)
        i = 0
        j = 0
        Do While i <= utfLen
            B = Utf(i)
            ' The number of leading 1 bits in the first byte gives the sequence length
            If (B And &HFC) = &HFC Then
                cnt = 6
            ElseIf (B And &HF8) = &HF8 Then
                cnt = 5
            ElseIf (B And &HF0) = &HF0 Then
                cnt = 4
            ElseIf (B And &HE0) = &HE0 Then
                cnt = 3
            ElseIf (B And &HC0) = &HC0 Then
                cnt = 2
            Else
                cnt = 1
            End If
            ' Truncated sequence at the end of the input
            If i + cnt - 1 > utfLen Then
                Buf(j) = "?"
                Exit Do
            End If
            ' Keep the payload bits of the first byte
            Select Case cnt
                Case 2
                    N = B And &H1F
                Case 3
                    N = B And &HF
                Case 4
                    N = B And &H7
                Case 5
                    N = B And &H3
                Case 6
                    N = B And &H1
                Case Else
                    Buf(j) = Chr(B)
                    GoTo Continued
            End Select
            ' Append 6 payload bits from each continuation byte
            For k = 1 To cnt - 1
                B = Utf(i + k)
                N = N * &H40 + (B And &H3F)
            Next
            If N > &H10FFFF Then
                ' Beyond the Unicode range
                Buf(j) = "?"
            ElseIf N > &HFFFF& Then
                ' ChrW accepts at most 65535, so emit a UTF-16 surrogate pair
                N = N - &H10000
                Buf(j) = ChrW(&HD800& + (N \ &H400)) & ChrW(&HDC00& + (N And &H3FF))
            Else
                Buf(j) = ChrW(N)
            End If
    Continued:
            i = i + cnt
            j = j + 1
        Loop
        Utf8ToUnicode = Join(Buf, "")
    End Function

2. With the API (including Unicode to UTF-8)

    Private Declare Function WideCharToMultiByte Lib "kernel32" ( _
        ByVal CodePage As Long, ByVal dwFlags As Long, _
        ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, _
        ByRef lpMultiByteStr As Any, ByVal cchMultiByte As Long, _
        ByVal lpDefaultChar As String, ByVal lpUsedDefaultChar As Long) As Long
    Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
        ByVal CodePage As Long, ByVal dwFlags As Long, _
        ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, _
        ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
    Private Const CP_UTF8 = 65001

    Function Utf8ToUnicode(ByRef Utf() As Byte) As String
        Dim lRet As Long
        Dim lLength As Long
        Dim lBufferSize As Long
        lLength = UBound(Utf) - LBound(Utf) + 1
        If lLength <= 0 Then Exit Function
        ' Generous buffer: the result never has more characters than input bytes
        lBufferSize = lLength * 2
        Utf8ToUnicode = String$(lBufferSize, Chr(0))
        ' VarPtr(Utf(0)) assumes the array is zero-based
        lRet = MultiByteToWideChar(CP_UTF8, 0, VarPtr(Utf(0)), lLength, StrPtr(Utf8ToUnicode), lBufferSize)
        If lRet <> 0 Then
            ' Trim to the number of characters actually written
            Utf8ToUnicode = Left(Utf8ToUnicode, lRet)
        End If
    End Function

    Function UnicodeToUtf8(ByVal UCS As String) As Byte()
        Dim lLength As Long
        Dim lBufferSize As Long
        Dim lResult As Long
        Dim abUTF8() As Byte
        lLength = Len(UCS)
        If lLength = 0 Then Exit Function
        ' At most 3 UTF-8 bytes per UTF-16 code unit
        lBufferSize = lLength * 3 + 1
        ReDim abUTF8(lBufferSize - 1)
        lResult = WideCharToMultiByte(CP_UTF8, 0, StrPtr(UCS), lLength, abUTF8(0), lBufferSize, vbNullString, 0)
        If lResult <> 0 Then
            ' lResult is the byte count; the upper bound is one less
            lResult = lResult - 1
            ReDim Preserve abUTF8(lResult)
            UnicodeToUtf8 = abUTF8
        End If
    End Function

    Private Sub Command1_Click()
        Dim byt() As Byte
        byt = UnicodeToUtf8("测试")
        Debug.Print Hex(byt(0)) & Hex(byt(1)) & Hex(byt(2))
        Debug.Print Utf8ToUnicode(byt())
    End Sub
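For comparison with the API-based UnicodeToUtf8, the encoding rules from section I can also be written out by hand. The following is a minimal sketch, not part of the original article: the name CodePointToUtf8 is hypothetical, and it assumes the input is a valid code point from 0 to &H10FFFF (it does not reject the surrogate range D800-DFFF).

```vb
' Hypothetical helper: encodes one code point using the table in section I.
' Assumes 0 <= cp <= &H10FFFF.
Function CodePointToUtf8(ByVal cp As Long) As Byte()
    Dim b() As Byte
    If cp < &H80 Then
        ' 0xxxxxxx
        ReDim b(0)
        b(0) = cp
    ElseIf cp < &H800 Then
        ' 110xxxxx 10xxxxxx
        ReDim b(1)
        b(0) = &HC0 Or (cp \ &H40)
        b(1) = &H80 Or (cp And &H3F)
    ElseIf cp < &H10000 Then
        ' 1110xxxx 10xxxxxx 10xxxxxx
        ReDim b(2)
        b(0) = &HE0 Or (cp \ &H1000)
        b(1) = &H80 Or ((cp \ &H40) And &H3F)
        b(2) = &H80 Or (cp And &H3F)
    Else
        ' 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        ReDim b(3)
        b(0) = &HF0 Or (cp \ &H40000)
        b(1) = &H80 Or ((cp \ &H1000) And &H3F)
        b(2) = &H80 Or ((cp \ &H40) And &H3F)
        b(3) = &H80 Or (cp And &H3F)
    End If
    CodePointToUtf8 = b
End Function
```

Calling CodePointToUtf8(&H2260) and printing Hex$ of the three resulting bytes should give E2, 89 and A0, matching the not-equal sign example in section I.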

    ' Copy the code below into a module
    ' Usage: Text1.Text = UTF8_Decode(UTF8Zfc)
    ' Note: convert the downloaded data directly; do not apply any other
    ' conversion (such as StrConv) first.
    '*************** Module code ********************
    ' Declarations for converting UTF-8 characters to Unicode
    Public Declare Function MultiByteToWideChar Lib "kernel32" ( _
        ByVal CodePage As Long, ByVal dwFlags As Long, _
        ByRef lpMultiByteStr As Any, ByVal cchMultiByte As Long, _
        ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
    Public Const CP_UTF8 = 65001

    ' Declarations for getting the system type
    Private Type OSVERSIONINFO
        dwOSVersionInfoSize As Long
        dwMajorVersion As Long
        dwMinorVersion As Long
        dwBuildNumber As Long
        dwPlatformId As Long
        szCSDVersion As String * 128
    End Type
    ' GetVersionExA returns a BOOL, which is a 32-bit Long in VB
    Private Declare Function GetVersionExA Lib "kernel32" (lpVersionInformation As OSVERSIONINFO) As Long

    ' Get the system type. The first character of the result is "1" for
    ' the Windows 9x family and "2" for the Windows NT family.
    Public Function GetVersion() As String
        Dim osinfo As OSVERSIONINFO
        Dim retvalue As Long
        osinfo.dwOSVersionInfoSize = 148
        osinfo.szCSDVersion = Space$(128)
        retvalue = GetVersionExA(osinfo)
        With osinfo
            Select Case .dwPlatformId
                Case 1
                    Select Case .dwMinorVersion
                        Case 0
                            GetVersion = "1Windows 95"
                        Case 10
                            GetVersion = "1Windows 98"
                        Case 90
                            GetVersion = "1Windows Millennium"
                    End Select
                Case 2
                    Select Case .dwMajorVersion
                        Case 3
                            GetVersion = "2Windows NT 3.51"
                        Case 4
                            GetVersion = "2Windows NT 4.0"
                        Case 5
                            If .dwMinorVersion = 0 Then
                                GetVersion = "2Windows 2000"
                            Else
                                GetVersion = "2Windows XP"
                            End If
                        Case Is >= 6
                            ' Vista and later are NT-based as well; without this case
                            ' they would fall through to the manual decoder below
                            GetVersion = "2Windows Vista or later"
                    End Select
                Case Else
                    GetVersion = "Failed"
            End Select
        End With
    End Function

    ' Purpose: convert UTF-8 characters to Unicode characters
    Public Function UTF8_Decode(ByVal sUTF8 As String) As String
        Dim lngUtf8Size As Long
        Dim strBuffer As String
        Dim lngBufferSize As Long
        Dim lngResult As Long
        Dim bytUtf8() As Byte
        Dim n As Long
        If LenB(sUTF8) = 0 Then Exit Function
        If Left(GetVersion(), 1) = "2" Then
            On Error GoTo EndFunction
            'bytUtf8 = StrConv(sUTF8, vbFromUnicode)
            ' Copies the raw bytes held in the string into the byte array
            bytUtf8 = sUTF8
            lngUtf8Size = UBound(bytUtf8) + 1
            On Error GoTo 0
            'Set buffer for longest possible string i.e. each byte is
            'ANSI, thus 1 unicode(2 bytes)for every utf-8 character.
            lngBufferSize = lngUtf8Size * 2
            strBuffer = String$(lngBufferSize, vbNullChar)
            'Translate using code page 65001(UTF-8)
            lngResult = MultiByteToWideChar(CP_UTF8, 0, bytUtf8(0), _
                lngUtf8Size, StrPtr(strBuffer), lngBufferSize)
            'Trim result to actual length
            If lngResult Then
                UTF8_Decode = Left(strBuffer, lngResult)
            End If
        Else
            Dim i As Long
            Dim TopIndex As Long
            Dim TwoBytes(1) As Byte
            Dim ThreeBytes(2) As Byte
            Dim AByte As Byte
            Dim TStr As String
            Dim BArray() As Byte
            'Resume on error in case someone inputs text with accents
            'that should have been encoded as UTF-8
            On Error Resume Next
            TopIndex = LenB(sUTF8) ' Number of bytes equal TopIndex+1
            If TopIndex = 0 Then Exit Function ' get out if there's nothing to convert
            'BArray = StrConv(sUTF8, vbFromUnicode)
            BArray = sUTF8
            i = 0 ' Initialise pointer
            TopIndex = TopIndex - 1
            ' Iterate through the Byte Array
            ' (4-byte sequences are not handled by this fallback)
            Do While i <= TopIndex
                AByte = BArray(i)
                If AByte < &H80 Then
                    ' Normal ANSI character - use it as is
                    TStr = TStr & Chr$(AByte): i = i + 1 ' Increment byte array index
                ElseIf AByte >= &HE0 Then 'was = &HE1 Then
                    ' Start of 3 byte UTF-8 group for a character
                    ' Copy 3 byte to ThreeBytes
                    ThreeBytes(0) = BArray(i): i = i + 1
                    ThreeBytes(1) = BArray(i): i = i + 1
                    ThreeBytes(2) = BArray(i): i = i + 1
                    ' Convert Byte array to UTF-16 then Unicode
                    TStr = TStr & ChrW$((ThreeBytes(0) And &HF) * &H1000 + (ThreeBytes(1) And &H3F) * &H40 + (ThreeBytes(2) And &H3F))
                ElseIf (AByte >= &HC2) And (AByte <= &HDF) Then ' was <= &HDB; 2-byte lead bytes run from C2 to DF
                    ' Start of 2 byte UTF-8 group for a character
                    TwoBytes(0) = BArray(i): i = i + 1
                    TwoBytes(1) = BArray(i): i = i + 1
                    ' Convert Byte array to UTF-16 then Unicode
                    TStr = TStr & ChrW$((TwoBytes(0) And &H1F) * &H40 + (TwoBytes(1) And &H3F))
                Else
                    ' Normal ANSI character - use it as is
                    TStr = TStr & Chr$(AByte): i = i + 1 ' Increment byte array index
                End If
            Loop
            UTF8_Decode = TStr ' Return the resultant string
            Erase BArray
        End If
    EndFunction:
    End Function
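The page title mentions GB2312, but the functions above stop at a VB (UTF-16) string. One possible way to take the remaining step is sketched below. It is not from the original article: the names UnicodeToGbk and CP_GBK are hypothetical, and it assumes that Windows code page 936 (GBK, Microsoft's superset of GB2312) is an acceptable target. It is meant to live in its own module, so its Private declaration does not clash with the declarations shown earlier.

```vb
' Hypothetical helper module, not part of the original article.
Private Declare Function WideCharToMultiByte Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, _
    ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, _
    ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long

' Assumption: code page 936 (GBK) stands in for GB2312
Private Const CP_GBK As Long = 936

Public Function UnicodeToGbk(ByVal sText As String) As Byte()
    Dim lSize As Long
    Dim abOut() As Byte
    If Len(sText) = 0 Then Exit Function
    ' A call with a zero-length output buffer returns the required byte count
    lSize = WideCharToMultiByte(CP_GBK, 0, StrPtr(sText), Len(sText), 0, 0, 0, 0)
    If lSize = 0 Then Exit Function
    ReDim abOut(lSize - 1)
    ' Second call performs the conversion into the sized buffer
    If WideCharToMultiByte(CP_GBK, 0, StrPtr(sText), Len(sText), _
            VarPtr(abOut(0)), lSize, 0, 0) <> 0 Then
        UnicodeToGbk = abOut
    End If
End Function
```

With both modules in a project, something like gbkBytes = UnicodeToGbk(UTF8_Decode(sUTF8)) would chain the two steps. Characters that have no GBK equivalent are replaced by the system default character, so the conversion can lose information.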
