I parsed a PDF file containing the Chinese text "我我我" with the
ContentParser example from podofo-0.9.1.
The output is:
=============================================
<</Type/XRef/DecodeParms<</Columns 4/Predictor 12>>/Filter/FlateDecode/ID[<63EE8B4DF319CC4C9DCB31874AAAFE26><076E0418816AAB40A6CA39C41BCE2178>]/Index[ 15 21]/Info 14 0 R/Length 67/Prev 46213/Root 16 0 R/Size 36/W[ 1 2 1]>>
Processing page 1... 1 Keyword: BT
2 Variant: /P
3 Variant: <<
/MCID 0
>>
4 Keyword: BDC
5 Variant: /CS0
6 Keyword: cs
7 Variant: 0
8 Keyword: scn
9 Variant: /C2_0
10 Variant: 1
11 Keyword: Tf
12 Variant: 10.560000
13 Variant: 0
14 Variant: 0
15 Variant: 10.560000
16 Variant: 90
17 Variant: 758.280000
18 Keyword: Tm
19 Variant: <184118411841>
20 Keyword: Tj
21 Variant: /TT0
22 Variant: 1
23 Keyword: Tf
24 Variant: 2.989000
25 Variant: 0
26 Keyword: Td
27 Variant: ( )
28 Keyword: Tj
29 Keyword: EMC
30 Keyword: ET
12 keywords, 18 variants - page ok
=============================================
I will call "我我我" strToExtract.
The UTF-16BE encoding of strToExtract (with a byte order mark) is
"FE FF 62 11 62 11 62 11", but in the output above it is
"19 Variant: <184118411841>" that corresponds to strToExtract.
I don't know how "6211" and "1841" are related.
With the following code I get the correct characters when I pass the hex
digits szTest = "FEFF621162116211".
The string "<184118411841>" from the page does not give me the text I want.
wstring func(const char* szTest)
{
    // szTest holds the hex digits only, e.g. "FEFF621162116211".
    // sizeof(szTest) is the size of a pointer, not the length of the text,
    // so the length is measured with strlen() (from <cstring>).
    std::vector<char> vecBuffer(szTest, szTest + strlen(szTest));
    PdfString string;
    string.SetHexData( vecBuffer.empty() ? "" : &(vecBuffer[0]),
                       vecBuffer.size(), NULL );
    return string.GetStringW();
}
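
To make the comparison between the two hex strings concrete, here is a
minimal sketch that does not use PoDoFo at all. It only reads the hex digits
as big-endian 16-bit values and drops a leading FEFF byte order mark. The
helper names hexValue and decodeHexUnits are made up for this illustration
and are not part of any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Convert a single hex digit to its numeric value; throws on other input.
static unsigned hexValue(char c)
{
    if (c >= '0' && c <= '9') return static_cast<unsigned>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<unsigned>(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return static_cast<unsigned>(c - 'A' + 10);
    throw std::invalid_argument("not a hex digit");
}

// Hypothetical helper: read hex digits (without the surrounding < >) as
// big-endian 16-bit values and remove a leading FEFF byte order mark.
std::vector<std::uint16_t> decodeHexUnits(const std::string& hex)
{
    if (hex.size() % 4 != 0)
        throw std::invalid_argument("expected a multiple of 4 hex digits");

    std::vector<std::uint16_t> units;
    for (std::size_t i = 0; i < hex.size(); i += 4)
    {
        unsigned value = 0;
        for (std::size_t j = 0; j < 4; ++j)
            value = (value << 4) | hexValue(hex[i + j]);
        units.push_back(static_cast<std::uint16_t>(value));
    }

    // Drop the byte order mark if the data starts with one.
    if (!units.empty() && units.front() == 0xFEFF)
        units.erase(units.begin());
    return units;
}
```

With this sketch, "FEFF621162116211" gives three values of 0x6211 (U+6211 is
"我"), while "184118411841" gives three values of 0x1841, so the string on
the page is clearly not plain UTF-16BE.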
Sorry for my poor English!
Any suggestions?
Best Wishes!