Hi Federico, Thanks for the patch. I also added the test to the existing DateTest.
Unfortunately, some of the existing tests are failing now (see the updated patch and the test failure output attached). I read through section 3.8.3 (Dates) of the PDF specification, but could not figure out whether the new behaviour is correct or not.

In particular, these dates are considered valid now, but were invalid before:

D:2012010
D:20120

And "D:20120530235959Z00'00'" gives a different result now: according to the output below, IsValid() returns false where the test expects true.

Any ideas?

Best regards,
Dominik

ominik@dominik-HP-Notebook: ~/Schreibtisch/Programming/podofo/code/podofo/build/test/unit$ ./podofo-test --test DateTest
.F.F.Parse sample from pdf_reference_1_7.pdf
Parse all fields set
Parse set year
Parse set year, month
Parse set year, month, day
Parse only year and timezone set
Parse berlin

<?xml version="1.0" encoding='ISO-8859-1' standalone='yes' ?>
<TestRun>
  <FailedTests>
    <FailedTest id="1">
      <Name>DateTest::testCreateDateFromString</Name>
      <FailureType>Assertion</FailureType>
      <Location>
        <File>/home/dominik/Desktop/Programming/podofo/code/podofo/test/unit/DateTest.cpp</File>
        <Line>43</Line>
      </Location>
      <Message>equality assertion failed
- Expected: 0
- Actual : 1
- D:2012010
</Message>
    </FailedTest>
    <FailedTest id="2">
      <Name>DateTest::testDateValue</Name>
      <FailureType>Assertion</FailureType>
      <Location>
        <File>/home/dominik/Desktop/Programming/podofo/code/podofo/test/unit/DateTest.cpp</File>
        <Line>79</Line>
      </Location>
      <Message>equality assertion failed
- Expected: 1
- Actual : 0
- D:20120530235959Z00'00'
</Message>
    </FailedTest>
  </FailedTests>
  <SuccessfulTests>
    <Test id="3">
      <Name>DateTest::testAdditional</Name>
    </Test>
  </SuccessfulTests>
  <Statistics>
    <Tests>3</Tests>
    <FailuresTotal>2</FailuresTotal>
    <Errors>0</Errors>
    <Failures>2</Failures>
  </Statistics>
</TestRun>

On Sat, Dec 5, 2020 at 10:15 AM Federico Kircheis <federico.kirch...@gmail.com> wrote:

> Hello,
>
> it seems that the class PdfDate does not comply with the standard when
> parsing dates.
>
> pdf_reference_1_7.pdf
> (https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/pdf_reference_1-7.pdf)
> gives "D:199812231952-08'00'" as an example date (3.8.3 Dates), but
> PdfDate does not parse it correctly.
>
> See the attached patch to make PdfDate conformant.
>
> A possible test would look like
>
> ----
> struct name_date {
>     std::string name;
>     std::string date;
> };
>
> const name_date data[] = {
>     {"sample from pdf_reference_1_7.pdf", "D:199812231952-08'00'"}, // UTC 1998-12-24 03:52:00
>     {"all fields set", "D:20201223195200-08'00'"}, // UTC 2020-12-24 03:52:00
>     {"set year", "D:2020"}, // UTC 2020-01-01 00:00:00
>     {"set year, month", "D:202001"}, // UTC 2020-01-01 00:00:00
>     {"set year, month, day", "D:20200101"}, // UTC 2020-01-01 00:00:00
>     {"only year and timezone set", "D:2020-08'00'"}, // UTC 2020-01-01 08:00:00
>     {"berlin", "D:20200315120820+01'00'"}, // UTC 2020-03-15 11:08:20
> };
>
> for (const auto& d : data) {
>     std::cout << "Parse " << d.name << "\n";
>     assert(PoDoFo::PdfDate(d.date).IsValid());
> }
> ----
>
> but I was not sure where to put it.
> _______________________________________________
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
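As a cross-check for the UTC values noted next to the test strings, here is a small sketch of the offset arithmetic. It assumes the convention from section 3.8.3 that '+' means local time is later than UT (so the offset is subtracted) and '-' means it is earlier (so the offset is added), which matches the sign of nZoneShift in the patch. UtcTime and toUtc are made-up names, not PoDoFo API, and the sketch only tracks the time of day plus a one-day carry, assuming the offset stays below 24 hours.

```cpp
#include <cassert>

// Hypothetical helper type, not part of PoDoFo: a UTC time of day plus
// the day shift relative to the local calendar date.
struct UtcTime {
    int dayCarry; // -1, 0 or +1 relative to the local date
    int hour;
    int minute;
};

// Converts a local time of day and a PDF-style zone offset to UTC.
// sign is '+', '-' or 'Z' as written in the date string.
// Assumption: the offset is smaller than 24 hours, so at most one day
// is carried in either direction.
UtcTime toUtc(int hour, int minute, char sign, int zoneHour, int zoneMin)
{
    assert(sign == '+' || sign == '-' || sign == 'Z');
    const int minutesPerDay = 24 * 60;
    const int offset = zoneHour * 60 + zoneMin;
    assert(offset < minutesPerDay);

    int total = hour * 60 + minute;
    if (sign == '+') {
        total -= offset; // local time is ahead of UT
    } else if (sign == '-') {
        total += offset; // local time is behind UT
    }

    UtcTime out{0, 0, 0};
    if (total < 0) {
        total += minutesPerDay;
        out.dayCarry = -1;
    } else if (total >= minutesPerDay) {
        total -= minutesPerDay;
        out.dayCarry = 1;
    }
    out.hour = total / 60;
    out.minute = total % 60;
    return out;
}
```

With these inputs, 19:52 at -08'00' comes out as 03:52 on the following day, and 12:08 at +01'00' as 11:08 on the same day.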
Index: src/podofo/base/PdfDate.cpp =================================================================== --- src/podofo/base/PdfDate.cpp (Revision 2018) +++ src/podofo/base/PdfDate.cpp (Arbeitskopie) @@ -47,17 +47,14 @@ } PdfDate::PdfDate( const time_t & t ) - : m_bValid( false ) + : m_time( t ), m_bValid( false ) { - m_time = t; CreateStringRepresentation(); } PdfDate::PdfDate( const PdfString & sDate ) - : m_bValid( false ) + : m_time( -1 ), m_bValid( false ) { - m_time = -1; - if ( !sDate.IsValid() ) { m_szDate[0] = 0; @@ -66,11 +63,8 @@ strncpy(m_szDate,sDate.GetString(),PDF_DATE_BUFFER_SIZE); - struct tm _tm; - memset( &_tm, 0, sizeof(_tm) ); - int nZoneShift = 0; - int nZoneHour = 0; - int nZoneMin = 0; + struct tm _tm{}; + _tm.tm_mday = 1; const char * pszDate = sDate.GetString(); if ( pszDate == NULL ) return; @@ -79,51 +73,53 @@ if ( *pszDate++ != ':' ) return; } - if ( ParseFixLenNumber(pszDate,4,0,9999,_tm.tm_year) == false ) + // year is not optional + if ( !ParseFixLenNumber(pszDate,4,0,9999,_tm.tm_year) ) return; - _tm.tm_year -= 1900; - if ( *pszDate != '\0' ) { - if ( ParseFixLenNumber(pszDate,2,1,12,_tm.tm_mon) == false ) - return; + // all other values are optional, if not set they are 0-init (except mday) + if ( ParseFixLenNumber(pszDate,2,1,12,_tm.tm_mon) ) + { _tm.tm_mon--; - if ( *pszDate != '\0' ) { - if ( ParseFixLenNumber(pszDate,2,1,31,_tm.tm_mday) == false ) return; - if ( *pszDate != '\0' ) { - if ( ParseFixLenNumber(pszDate,2,0,23,_tm.tm_hour) == false ) return; - if ( *pszDate != '\0' ) { - if ( ParseFixLenNumber(pszDate,2,0,59,_tm.tm_min) == false ) return; - if ( *pszDate != '\0' ) { - if ( ParseFixLenNumber(pszDate,2,0,59,_tm.tm_sec) == false ) return; - if ( *pszDate != '\0' ) { - switch(*pszDate++) { - case '+': - nZoneShift = -1; - break; - case '-': - nZoneShift = 1; - break; - case 'Z': - nZoneShift = 0; - break; - default: - return; - } - if ( ParseFixLenNumber(pszDate,2,0,59,nZoneHour) == false ) return; - if ( *pszDate == '\'' 
) { - pszDate++; - if ( ParseFixLenNumber(pszDate,2,0,59,nZoneMin) == false ) return; - if ( *pszDate != '\'' ) return; - pszDate++; - } - } - } - } + if ( ParseFixLenNumber(pszDate,2,1,31,_tm.tm_mday) ) + { + if ( ParseFixLenNumber(pszDate,2,0,23,_tm.tm_hour) ) + { + if ( ParseFixLenNumber(pszDate,2,0,59,_tm.tm_min) ) + ParseFixLenNumber(pszDate,2,0,59,_tm.tm_sec); } } } + // zone is optional + int nZoneShift = 0; + int nZoneHour = 0; + int nZoneMin = 0; + + if (*pszDate == 'Z') { + ++pszDate; + } else if (*pszDate != '\0') { + switch (*pszDate++) { + case '+': + nZoneShift = -1; + break; + case '-': + nZoneShift = 1; + break; + default: + return; + } + if ( !ParseFixLenNumber(pszDate,2,0,59,nZoneHour) ) return; + if (*pszDate == '\'') { + pszDate++; + if ( !ParseFixLenNumber(pszDate,2,0,59,nZoneMin) ) return; + if (*pszDate != '\'') + return; + pszDate++; + } + } + if ( *pszDate != '\0' ) { return; @@ -206,9 +202,9 @@ } -bool PdfDate::ParseFixLenNumber(const char *&in, unsigned int length, int min, int max, int &ret) +bool PdfDate::ParseFixLenNumber(const char *&in, unsigned int length, int min, int max, int &ret_) { - ret = 0; + int ret = 0; for(unsigned int i=0;i<length;i++) { if ( in == NULL || !isdigit(*in)) return false; @@ -216,6 +212,7 @@ in++; } if ( ret < min || ret > max ) return false; + ret_ = ret; return true; } Index: src/podofo/base/PdfDate.h =================================================================== --- src/podofo/base/PdfDate.h (Revision 2018) +++ src/podofo/base/PdfDate.h (Arbeitskopie) @@ -124,7 +124,7 @@ * \param length of number to read * \param min minimal value of number * \param max maximal value of number - * \param ret parsed number + * \param ret parsed number (updated only on success) */ bool ParseFixLenNumber(const char *&in, unsigned int length, int min, int max, int &ret); Index: test/unit/DateTest.cpp =================================================================== --- test/unit/DateTest.cpp (Revision 2018) +++ 
test/unit/DateTest.cpp (Arbeitskopie) @@ -30,8 +30,8 @@ { } -void DateTest::tearDown() -{ +void DateTest::tearDown(){ + } void checkExpected(const char *pszDate, bool bExpected) @@ -38,7 +38,14 @@ { PdfString tmp(pszDate); PdfDate date(tmp); - CPPUNIT_ASSERT_EQUAL(bExpected,date.IsValid()); + if( pszDate != NULL ) + { + CPPUNIT_ASSERT_EQUAL_MESSAGE(pszDate,bExpected,date.IsValid()); + } + else + { + CPPUNIT_ASSERT_EQUAL_MESSAGE("NULL",bExpected,date.IsValid()); + } } void DateTest::testCreateDateFromString() @@ -45,7 +52,7 @@ { checkExpected(NULL,false); checkExpected("D:2012",true); - checkExpected("D:20120",false); + checkExpected("D:20120",false); // checkExpected("D:201201",true); checkExpected("D:2012010",false); checkExpected("D:20120101",true); @@ -66,8 +73,10 @@ void DateTest::testDateValue() { - PdfDate date(PdfString("D:20120530235959Z00'00'")); - CPPUNIT_ASSERT_EQUAL(true,date.IsValid()); + const char* pszDate = "D:20120530235959Z00'00'"; + PdfString tmp(pszDate); + PdfDate date(tmp); + CPPUNIT_ASSERT_EQUAL_MESSAGE(std::string(pszDate),true,date.IsValid()); const time_t &time = date.GetTime(); struct tm _tm; memset (&_tm, 0, sizeof(struct tm)); @@ -81,4 +90,27 @@ CPPUNIT_ASSERT_EQUAL(true,time==time2); } +void DateTest::testAdditional() +{ + struct name_date { + std::string name; + std::string date; + }; + const name_date data[] = { + {"sample from pdf_reference_1_7.pdf", "D:199812231952-08'00'"}, + // UTC 1998-12-24 03:52:00 + {"all fields set", "D:20201223195200-08'00'"}, // UTC 2020-12-03:52:00 + {"set year", "D:2020"}, // UTC 2020-01-01 00:00:00 + {"set year, month", "D:202001"}, // UTC 2020-01-01 00:00:00 + {"set year, month, day", "D:20200101"}, // UTC 202001-01 00:00:00 + {"only year and timezone set", "D:2020-08'00'"}, // UTC 2020-01-01 08:00:00 + {"berlin", "D:20200315120820+01'00'"}, // UTC 2020-03-15 11:08:20 + }; + + for (const auto& d : data) { + std::cout << "Parse " << d.name << "\n"; + assert(PoDoFo::PdfDate(d.date).IsValid()); + } +} + 
Index: test/unit/DateTest.h =================================================================== --- test/unit/DateTest.h (Revision 2018) +++ test/unit/DateTest.h (Arbeitskopie) @@ -29,6 +29,7 @@ CPPUNIT_TEST_SUITE( DateTest ); CPPUNIT_TEST( testCreateDateFromString ); CPPUNIT_TEST( testDateValue ); + CPPUNIT_TEST( testAdditional ); CPPUNIT_TEST_SUITE_END(); public: void setUp(); @@ -36,6 +37,8 @@ void testCreateDateFromString(); void testDateValue(); + void testAdditional(); + }; #endif
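One possible explanation for the two newly accepted strings, as far as I can tell from the patch: ParseFixLenNumber takes the cursor by reference and advances it digit by digit, so a failed call still seems to consume the digits it managed to read. For "D:20120" the month parse would eat the single trailing "0", fail, and leave the cursor at the end of the string, so the final end-of-string check passes. Below is a minimal sketch of a variant that moves the cursor only on success. parseFixLenNumber and isValidDatePrefix are illustrative names, not the PoDoFo functions, and the check covers only year, month and day (no time, no zone).

```cpp
#include <cassert>
#include <cctype>

// Sketch: parse exactly `length` digits into `out`.
// On failure neither `in` nor `out` is modified.
bool parseFixLenNumber(const char*& in, unsigned length, int min, int max, int& out)
{
    assert(in != nullptr);
    const char* p = in; // work on a copy of the cursor
    int value = 0;
    for (unsigned i = 0; i < length; ++i) {
        if (!std::isdigit(static_cast<unsigned char>(*p)))
            return false;
        value = value * 10 + (*p - '0');
        ++p;
    }
    if (value < min || value > max)
        return false;
    in = p;      // commit the cursor only after full success
    out = value;
    return true;
}

// Hypothetical reduced validity check: optional "D:" prefix, mandatory
// year, optional month, optional day. Anything left over is an error.
bool isValidDatePrefix(const char* s)
{
    assert(s != nullptr);
    int year = 0, month = 1, day = 1;
    if (s[0] == 'D' && s[1] == ':')
        s += 2;
    if (!parseFixLenNumber(s, 4, 0, 9999, year))
        return false;
    if (parseFixLenNumber(s, 2, 1, 12, month))
        parseFixLenNumber(s, 2, 1, 31, day);
    // A partially read field leaves its digits unconsumed, so they are
    // caught here as trailing input.
    return *s == '\0';
}
```

With this variant "D:20120" and "D:2012010" are rejected again, while "D:2012", "D:201201" and "D:20120101" stay valid. It would not change the "Z00'00'" case: the new zone branch in the patch skips the 'Z' and then expects the end of the string, so whether a zone offset may follow 'Z' is a separate question.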