Re: [SOLVED] Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-10 Thread Max Nikulin

On 10/07/2024 15:37, Ceppo wrote:

but I couldn't build a working gs command.

[...]

[1]: https://github.com/qpdf/qpdf/issues/85


There is a link to gs arguments





[SOLVED] Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-10 Thread Ceppo
On Mon, Jul 08, 2024 at 05:20:57PM GMT, Jeffrey Walton wrote:
> The pdf-linter I use to verify a pdf document is qpdf,
> . It is available on most distributions,
> including Debian, Fedora and Red Hat.
>
> The command to check the document is `qpdf --check `.

This command doesn't show me any info about PDF/A compliance. The man page says
it "merely checks that the PDF file is syntactically valid".

> > I will also probably have to upload under the same requirement some
> > third-party PDF, which is not PDF/A, without access to an editable version.
> > Is there a way to convert them to PDF/A? I know that converting from an
> > editable version would be the correct way for this, but I have no real way
> > to get it.
>
> qpdf may provide this functionality, but I have never used it.

[1] says PDF/A conversion is out of scope for the library. However, [2] pointed
me to ocrmypdf and this command produces a valid PDF/A-1b file:

ocrmypdf --output-type pdfa-1 --tesseract-timeout=0 --skip-text \
input.pdf output.pdf

Another comment pointed out this relies on ghostscript, but I couldn't build a
working gs command. I will try harder as soon as I have some free time. Anyway
I have my conversion tool now, and I'm happy with it.

As a short summary of this thread outcome, I can:

- compile with `pdflatex` as usual
- convert to PDF/A with the `ocrmypdf` command above (probably not the most
  clean way, but it works)
- validate with veraPDF
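
The three steps above could be chained in a small shell function. This is only
a sketch: the names `run` and `build_pdfa`, the `DRY_RUN` variable and the
`-pdfa.pdf` output suffix are made up for illustration, and `--flavour 1b`
assumes the PDF/A-1b target mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the workflow: compile, convert, validate.
# $1 is the LaTeX file name without the .tex extension.
build_pdfa() {
    base=$1
    run pdflatex -interaction=nonstopmode "$base.tex" &&
    run ocrmypdf --output-type pdfa-1 --tesseract-timeout=0 --skip-text \
        "$base.pdf" "$base-pdfa.pdf" &&
    run verapdf --flavour 1b "$base-pdfa.pdf"
}
```

Calling `DRY_RUN=1 build_pdfa report` prints the three commands without running
them; `build_pdfa report` runs them and stops at the first step that fails.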

Thanks everyone for your help, it was highly appreciated even when it didn't
work as expected!


[1]: https://github.com/qpdf/qpdf/issues/85
[2]: https://github.com/qpdf/qpdf/issues/85#issuecomment-1278055568


--
Ceppo




Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-08 Thread Jeffrey Walton
On Mon, Jul 8, 2024 at 5:56 PM Ceppo  wrote:
>
> On Wed, Jul 03, 2024 at 06:38:51PM GMT, Richard wrote:
> > From LaTeX, this is quite simple, there's a package for that - as for pretty
> > much everything in the LaTeX world. Googling for just like 10 sec could have
> > given you this great guide: https://webpages.tuni.fi/latex/pdfa-guide.pdf
>
> I did my research and found the document you linked. In fact it's what pointed
> me to the pdfx LaTeX package, but I couldn't make it work. I acknowledge I
> missed its reference to veraPDF, though.
>
> > gs -dQUIET -dUseCIEColor -sProcessColorModel=DeviceCMYK -sDEVICE=pdfwrite
> > -dPDFACompatibilityPolicy=1 -dCompressFonts=true -dSubsetFonts=true
> > -sFONTPATH=/usr/share/fonts/ -o  
>
> The output isn't accepted by veraPDF, either. I will try to understand
> something more about ghostscript.

Have a look at . It discusses
some of the finer points of PDF/A conversion in the comments, like
color spaces.

Jeff



Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-08 Thread Jeffrey Walton
On Wed, Jul 3, 2024 at 12:13 PM Ceppo  wrote:
>
> I wrote a report with LaTeX, and afterwards discovered it must be
> PDF/A-compliant - which it wasn't. I found the pdfx LaTeX package and followed
> its instructions, thus obtaining a file that should be PDF/A and pdfinfo
> identifies as such, but my employer's upload form thinks it isn't. Is pdfinfo
> reliable enough that I can tell my employer his form is broken? If not, how
> can I make sure that pdflatex's output is actually PDF/A-compliant?

The pdf-linter I use to verify a pdf document is qpdf,
. It is available on most distributions,
including Debian, Fedora and Red Hat.

The command to check the document is `qpdf --check `.
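
A minimal sketch of wrapping that check in a script, assuming qpdf's documented
exit statuses (0 for a clean file, 2 for errors, 3 for warnings). The function
names are made up for illustration. Note that this tests PDF syntax only and
says nothing about PDF/A conformance.

```shell
# Hypothetical helper: translate a qpdf exit status into a short message.
qpdf_status_text() {
    case "$1" in
        0) echo "no errors or warnings" ;;
        2) echo "errors found" ;;
        3) echo "warnings found" ;;
        *) echo "unexpected exit status $1" ;;
    esac
}

# Hypothetical wrapper: run the syntax check on file $1 and report the outcome.
check_syntax() {
    qpdf --check "$1" >/dev/null 2>&1
    qpdf_status_text "$?"
}
```

With qpdf installed, `check_syntax report.pdf` prints one of the messages
above.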

> I will also probably have to upload under the same requirement some
> third-party PDF, which is not PDF/A, without access to an editable version.
> Is there a way to convert them to PDF/A? I know that converting from an
> editable version would be the correct way for this, but I have no real way
> to get it.

qpdf may provide this functionality, but I have never used it. From
the project's description: "qpdf is a command-line tool and C++
library that performs content-preserving transformations on PDF files.
It supports linearization, encryption, and numerous other features. It
can also be used for splitting and merging files, creating PDF files
(but you have to supply all the content yourself), and inspecting
files for study or analysis."

Another tool I would look at is GhostScript. It looks like it can
convert to PDF/A: .

> A requirement of any solution is that it doesn't rely on non-DFSG-compliant
> software, including online conversion tools.

Jeff



Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-08 Thread Ceppo
On Wed, Jul 03, 2024 at 11:15:51AM GMT, Henning Follmann wrote:
> On Wed, Jul 03, 2024 at 01:06:56PM +, Ceppo wrote:
> > A requirement of any solution is that it doesn't rely on non-DFSG-compliant
> > software, including online conversion tools.
>
> Please look at this thread at StackExchange. I found that to be very
> helpful.
> https://tex.stackexchange.com/questions/130201/pdf-a-with-hyperref-on-tex-live-2013/136653#136653
>
> Please let me know how it works out for you.

Hello.
Thanks for pointing to the thread, but the solution isn't suitable for me. I
need a solution that does not rely on non-DFSG-compliant software, but the
first step requires using a file from a zip archive [1] whose license
explicitly forbids modifying and selling it.


[1]: http://www.eci.org/_media/downloads/icc_profiles_from_eci/ecirgbv20.zip


--
Ceppo




Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-08 Thread Ceppo
On Wed, Jul 03, 2024 at 03:36:17PM GMT, to...@tuxteam.de wrote:
> On Wed, Jul 03, 2024 at 01:06:56PM +, Ceppo wrote:
> > I wrote a report with LaTeX, and afterwards discovered it must be
> > PDF/A-compliant - which it wasn't. I found the pdfx LaTeX package and followed
> > its instructions, thus obtaining a file that should be PDF/A and pdfinfo
> > identifies as such, but my employer's upload form thinks it isn't [...]
>
> Uh-oh. We set the standards, but won't tell you what they are.

Well, in fact they did tell - they just did *after* I produced my report. But
yes, the workflow is very broken...

> Not concrete help, but the Wikipedia [1] makes for an interesting
> read (including refs to bunches of test suites you can throw at your
> publisher's site to find out where their validator is failing).

I read about Isartor Test Suite, but [1] says it checks if the validator
accepts non-compliant files, not if it rejects compliant files.

> And there seems to be a kind of semi-official validator, according
> to the above ref.

I guess you mean veraPDF?


[1]: https://pdfa.org/resource/isartor-test-suite/


--
Ceppo




Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-08 Thread Ceppo
On Wed, Jul 03, 2024 at 06:38:51PM GMT, Richard wrote:
> From LaTeX, this is quite simple, there's a package for that - as for pretty
> much everything in the LaTeX world. Googling for just like 10 sec could have
> given you this great guide: https://webpages.tuni.fi/latex/pdfa-guide.pdf

I did my research and found the document you linked. In fact it's what pointed
me to the pdfx LaTeX package, but I couldn't make it work. I acknowledge I
missed its reference to veraPDF, though.

> gs -dQUIET -dUseCIEColor -sProcessColorModel=DeviceCMYK -sDEVICE=pdfwrite
> -dPDFACompatibilityPolicy=1 -dCompressFonts=true -dSubsetFonts=true
> -sFONTPATH=/usr/share/fonts/ -o  

The output isn't accepted by veraPDF, either. I will try to understand
something more about ghostscript.


--
Ceppo




Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-08 Thread Ceppo

On Wed, Jul 03, 2024 at 10:52:06PM GMT, y...@vienna.at wrote:
> Well, that is my way:

Thanks for providing your script. I tried it with one tweak:

> latex  .../Nix.tex  .../Nix.dvi
> dvips -o Nix.ps  Nix.pdf
   ^^^
I guess here you meant Nix.dvi...

> ps2pdf ... Nix.ps ... Nix.pdf
> chmod 755 script
> All has worked absolutely perfectly for many, many years; nothing else was
> ever needed

However, the resulting PDF is not recognized as PDF/A by veraPDF. Have you
tested it with something else?


--
Ceppo




Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-08 Thread Ceppo
On Wed, Jul 03, 2024 at 10:18:01AM GMT, Sarunas Burdulis wrote:
> pdfinfo probably only reads metadata, but does not do any PDF/A compliance
> validation.
>
> VeraPDF seems to work for validation (https://verapdf.org/software/).

I don't know about pdfinfo, but it looks like veraPDF at least agrees with my
contractor's form. Thanks for pointing me to it, it looks like now I have a
tool to check if my document is compliant.
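
For reference, a command-line check with veraPDF might look like the sketch
below. It assumes the veraPDF launcher is installed as `verapdf`; the function
name and the `VERAPDF` override variable are made up for illustration.

```shell
# Hypothetical wrapper: validate file $1 against a PDF/A flavour ($2, default 1b)
# and print a plain-text report. Set VERAPDF if the launcher is not on PATH.
validate_pdfa() {
    "${VERAPDF:-verapdf}" --flavour "${2:-1b}" --format text "$1"
}
```

For example, `validate_pdfa report.pdf` checks against PDF/A-1b and
`validate_pdfa report.pdf 2b` against PDF/A-2b.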


--
Ceppo




Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-04 Thread Richard
The first bit is just a warning, not an error. Of course, you could check
what has changed in v9.11 that makes this not recommended anymore. Maybe
they already handle it internally when you set -dPDFACompatibilityPolicy=1
and the old setting can interfere. But when the output of the validator
doesn't change, it's probably just meant as you don't need to specify this
anymore, we activate it ourselves.

Speaking of the validator, those look more like warnings too and not like
deal breakers. In the end, only you know what your contractor expects of
you. And if they don't even bother inspecting the result, this will be
irrelevant. After all, the only reason PDF/A exists is for archiving
reasons. It pretty much just throws out all the proprietary clutter from
the PDF standard. The important thing is that fonts are embedded to always
be able to display them correctly, and that it's specified how images and
other media are embedded. If your contractor expects more of you, they
should pay for the appropriate software.

Richard

PS: this isn't really meant for this, but you could install Scribus and try
to import the PDF there. It also has a validator similar to Adobe's
Preflight. Maybe it can give you a more precise result. I'm not sure if it
even can output PDF/A, I only know that it does PDF/X, but maybe it can
even be used for better conversion to PDF/A. The last time I tried to
import a large PDF into Scribus it got kinda stuck, but it has evolved
since then and maybe it was a hardware limitation.

On Thu, Jul 4, 2024 at 4:38 AM Greg Marks  wrote:

> $gs -dQUIET -dUseCIEColor -sProcessColorModel=DeviceCMYK -sDEVICE=pdfwrite 
> -dPDFACompatibilityPolicy=1
> -dCompressFonts=true -dSubsetFonts=true -sFONTPATH=/usr/share/fonts/ -o
> new.pdf old.pdf
>
> [Gives warnings:
>
>GPL Ghostscript 10.00.0:
>
>Use of -dUseCIEColor detected!
>Since the release of version 9.11 of Ghostscript we recommend you do
> not set
>-dUseCIEColor with the pdfwrite/ps2write device family.]
>
> Uploading new.pdf to https://www.pdfforge.org/online/en/validate-pdfa
> produces report "The file is not a valid PDF/A document" with these
> details:
>
>ISO 19005-1:2005
>6.1.8
>The object number and generation number shall be separated
>by a single white-space character. The generation number
>and obj keyword shall be separated by a single white-space
>character. The object number and endobj keyword shall each be
>preceded by an EOL marker. The obj and endobj keywords shall
>each be followed by an EOL marker.
>
>ISO 19005-1:2005
>6.1.7
>The stream keyword shall be followed either by a CARRIAGE RETURN
>(0Dh) and LINE FEED (0Ah) character sequence or by a single
>LINE FEED character. The endstream keyword shall be preceded
>by an EOL marker
>
> Repeating with the flag -dUseCIEColor removed prevents the Ghostscript
> warnings but doesn't change the PDF/A validation result.
>
> Best regards,
> Greg Marks
>


Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread jeremy ardley



On 4/7/24 11:10, Stefan Monnier wrote:

This might qualify as a bug in your MUA (it can make sense to require
a small font for some parts of the message, but it seems this style
applies to the whole message, which makes no sense), tho maybe it's due
to some particularity of your configuration, or of the way you use your
MUA's editor.



I use Thunderbird and I usually remember to select Sending Format as
text only when sending to lists. This should always render correctly on
any MUA.





Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread Stefan Monnier
Hi Richard,

I don't see any problem because I'm reading this mailing-list from a MUA
that's mostly text-only and doesn't try to use variable-size fonts, but
looking at the HTML you send I see:

>  style="font-family:arial,helvetica,sans-serif;font-size:small">

repeated several times.  I have no idea why your MUA puts it there, but
I suspect that's the reason some of the readers here find your email's
messages to be hard to read: your mail specifically asks for
`font-size:small`.

This might qualify as a bug in your MUA (it can make sense to require
a small font for some parts of the message, but it seems this style
applies to the whole message, which makes no sense), tho maybe it's due
to some particularity of your configuration, or of the way you use your
MUA's editor.


Stefan



small font (was: Re: Creating PDF/A from LaTeX source and from existing PDF)

2024-07-03 Thread Max Nikulin

I am in doubts what is more rude:

On 04/07/2024 04:02, Richard wrote:

Please stop using such a dinky font. There are plenty of old farts
trying to read this list.


- writing this before an attempt to hijack the thread using an already 
discussed question,


Tell that to your mail program. If it chooses to show you the mail that 
way, don't blame me.


- insisting on an "industry standard" mail style


Tell that to your mail progra=


---^^^




Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread Max Nikulin

On 04/07/2024 04:49, Greg Marks wrote:


$ gs -dQUIET -dUseCIEColor -sProcessColorModel=DeviceCMYK -sDEVICE=pdfwrite \
     -dPDFACompatibilityPolicy=1 -dCompressFonts=true -dSubsetFonts=true \
     -sFONTPATH=/usr/share/fonts/ -o new.pdf old.pdf

[...]

The object number and generation number shall be separated
by a single white-space character. The generation number

[...]

The stream keyword shall be followed either by a CARRIAGE RETURN


I expect that pdftk and qpdf have their own serializers. I have no idea 
if they can transform a file to a PDF/A compliant document, but they 
might use proper separators.


Perhaps LaTeX documents require some tuning (metadata blocks, etc.). If 
you use pdflatex then I would try lualatex.
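
As an illustration of the metadata tuning, the pdfx package mentioned elsewhere
in this thread reads its metadata from a `<jobname>.xmpdata` file next to the
source. The sketch below writes a minimal test document; the function name, the
file names and all metadata values are placeholders, and whether the result
validates still has to be checked.

```shell
# Hypothetical helper: write a minimal pdfx test document into directory $1,
# using $2 as the job name (default "report").
write_pdfx_skeleton() {
    dir=$1
    job=${2:-report}

    # The a-1b option asks pdfx for PDF/A-1b output.
    cat > "$dir/$job.tex" <<'EOF'
\documentclass{article}
\usepackage[a-1b]{pdfx}
\begin{document}
Hello, PDF/A.
\end{document}
EOF

    # pdfx reads document metadata from <jobname>.xmpdata; values are placeholders.
    cat > "$dir/$job.xmpdata" <<'EOF'
\Title{Example report}
\Author{Jane Doe}
\Language{en-GB}
EOF
}
```

After `write_pdfx_skeleton . report`, compiling with `lualatex report.tex` (or
`pdflatex report.tex`) and running the output through a validator shows
whether the toolchain itself produces compliant files.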





Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread Greg Marks
> Now, for just random PDFs, this is a bit more tricky, but you can do so
> with ghostscript. Now, this sadly doesn't have such a great guide, but
> something like this should do the trick, though that's only PDF/A-1 for all
> I can tell. If your contractor needs a different version, you'll have to
> adapt it:
> 
> gs -dQUIET -dUseCIEColor -sProcessColorModel=DeviceCMYK -sDEVICE=pdfwrite
> -dPDFACompatibilityPolicy=1 -dCompressFonts=true -dSubsetFonts=true
> -sFONTPATH=/usr/share/fonts/ -o  

This does not seem to work.  For example:

$ cd /tmp

$ wget -O old.pdf https://arxiv.org/pdf/2406.18499

$ gs -dQUIET -dUseCIEColor -sProcessColorModel=DeviceCMYK -sDEVICE=pdfwrite \
     -dPDFACompatibilityPolicy=1 -dCompressFonts=true -dSubsetFonts=true \
     -sFONTPATH=/usr/share/fonts/ -o new.pdf old.pdf

[Gives warnings: 

   GPL Ghostscript 10.00.0:

   Use of -dUseCIEColor detected!
   Since the release of version 9.11 of Ghostscript we recommend you do not set
   -dUseCIEColor with the pdfwrite/ps2write device family.]

Uploading new.pdf to https://www.pdfforge.org/online/en/validate-pdfa
produces report "The file is not a valid PDF/A document" with these
details:

   ISO 19005-1:2005
   6.1.8
   The object number and generation number shall be separated
   by a single white-space character. The generation number
   and obj keyword shall be separated by a single white-space
   character. The object number and endobj keyword shall each be
   preceded by an EOL marker. The obj and endobj keywords shall
   each be followed by an EOL marker.

   ISO 19005-1:2005
   6.1.7
   The stream keyword shall be followed either by a CARRIAGE RETURN
   (0Dh) and LINE FEED (0Ah) character sequence or by a single
   LINE FEED character. The endstream keyword shall be preceded
   by an EOL marker

Repeating with the flag -dUseCIEColor removed prevents the Ghostscript
warnings but doesn't change the PDF/A validation result.

Best regards,
Greg Marks




Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread Richard
Well, guess what, I haven't done anything to change the way messages look.
The only settings I ever change is how they are displayed to me. And never
has anyone ever had an issue with that, in many years. Probably because
other people are just not using unusable software. And quite frankly,
punishing the ignorant for their ignorance is the best policy there is. If
you go out of your way to make life as difficult as possible for yourself,
that's your issue. Don't make it everybody else's issue. Quit whining and
learn how to search the internet for solutions first. The chance that
you're the first to ask such basic questions is pretty much not existent.
And if you refuse to learn, that's up to you. But then you didn't learn
life's biggest lesson, you never stop learning.

And with that I'm ending this ridiculous discussion, this has gone far
enough off-topic.

Best

On Wed, Jul 3, 2024 at 11:12 PM Greg Wooledge  wrote:

> That said, I wonder *why* you would go out of your way to make your
> messages harder to read for people who don't know how to activate every
> single feature of their MUA.
>


Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread Greg Wooledge
On Wed, Jul 03, 2024 at 23:02:16 +0200, Richard wrote:
> >
> > Please stop using such a dinky font. There are plenty of old farts trying
> > to read this list.
> 
> Tell that to your mail program. If it chooses to show you the mail that
> way, don't blame me. Everything needed to display it any way you want is
> there, it just needs to be used. Thunderbird can define a minimum text size
> and refuse messages to use their own font. If your archaic software doesn't
> do basics, blame the dev - or better yet yourself, as the choice is yours.

I never saw any problem, as my terminal-based MUA renders your text/plain
part just fine.  I didn't even know you were posting multi-part messages
until someone complained about the font size.

That said, I wonder *why* you would go out of your way to make your
messages harder to read for people who don't know how to activate every
single feature of their MUA.  It would be a good policy to make your
messages as easy to read as possible, for as many people as possible,
by default.

If you're simply punishing the ignorant for their ignorance, well, that
seems a bit spiteful.



Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread Richard
>
> Please stop using such a dinky font. There are plenty of old farts trying
> to read this list.

Tell that to your mail program. If it chooses to show you the mail that
way, don't blame me. Everything needed to display it any way you want is
there, it just needs to be used. Thunderbird can define a minimum text size
and refuse messages to use their own font. If your archaic software doesn't
do basics, blame the dev - or better yet yourself, as the choice is yours.

And the other part is its own thread. I've commented everything I know.
Just redo your book as native ePub, there's no way around it. If you want
to find shortcuts, you'll have to do your own research, even beyond Linux
there probably is no piece of software that can do what you are looking
for. But on the other hand, I'd never have expected ghostscript - or to be
more precise GhostPDL, if I'm not mistaken - to be able to handle
Microsoft's rubbish XPS format and convert that to a proper PDF. So who
knows? Instead of going on other people's nerves with an unsolvable issue,
put those questions into the search machine of your choice. Maybe it will
be more competent than your mail program.

Richard

Am Mi., 3. Juli 2024 um 21:20 Uhr schrieb Van Snyder <
van.sny...@sbcglobal.net>:

> On Wed, 2024-07-03 at 18:38 +0200, Richard wrote:
>
> For anything further, you'll have to research yourself as ghostscript is
> very complex but used by many people.
>
>
> Please stop using such a dinky font. There are plenty of old farts trying
> to read this list.
>
>
> Can ghostscript convert a PDF generated by pdflatex to ePub or mobi?
>
> Calibre made a mess, especially of tables. E-mailing it to my Kindle
> account with "convert" in the subject line made a mess. Tools to convert
> LaTeX to html in the hope of ultimately getting to ePub or mobi utterly
> failed, so I don't know whether they in the end would have made a mess.
>
>


Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread Van Snyder
On Wed, 2024-07-03 at 15:31 -0400, e...@gmx.us wrote:
> On 7/3/24 15:20, Van Snyder wrote:
> > On Wed, 2024-07-03 at 18:38 +0200, Richard wrote:
> > > For anything further, you'll have to research yourself as
> > > ghostscript
> > > is very complex but used by many people.
> > 
> > Please stop using such a dinky font.
> 
> That's what ctrl-shift-+ is for.

Yeah, those of us who have been at this for a decade or two know that.
But it makes everything else so large that it doesn't fit anymore, even
at full screen, on my laptop.




Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread eben

On 7/3/24 15:20, Van Snyder wrote:

On Wed, 2024-07-03 at 18:38 +0200, Richard wrote:

For anything further, you'll have to research yourself as ghostscript
is very complex but used by many people.


Please stop using such a dinky font.


That's what ctrl-shift-+ is for.




Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread Van Snyder
On Wed, 2024-07-03 at 18:38 +0200, Richard wrote:
> For anything further, you'll have to research yourself as ghostscript
> is very complex but used by many people.

Please stop using such a dinky font. There are plenty of old farts
trying to read this list.


Can ghostscript convert a PDF generated by pdflatex to ePub or mobi?

Calibre made a mess, especially of tables. E-mailing it to my Kindle
account with "convert" in the subject line made a mess. Tools to
convert LaTeX to html in the hope of ultimately getting to ePub or mobi
utterly failed, so I don't know whether they in the end would have made
a mess.



Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread Richard
From LaTeX, this is quite simple, there's a package for that - as for
pretty much everything in the LaTeX world. Googling for just like 10 sec
could have given you this great guide:
https://webpages.tuni.fi/latex/pdfa-guide.pdf

Now, for just random PDFs, this is a bit more tricky, but you can do so
with ghostscript. Now, this sadly doesn't have such a great guide, but
something like this should do the trick, though that's only PDF/A-1 for all
I can tell. If your contractor needs a different version, you'll have to
adapt it:

gs -dQUIET -dUseCIEColor -sProcessColorModel=DeviceCMYK -sDEVICE=pdfwrite
-dPDFACompatibilityPolicy=1 -dCompressFonts=true -dSubsetFonts=true
-sFONTPATH=/usr/share/fonts/ -o  

Now, one common thing that can happen is that you don't have the necessary
fonts installed (I'm using the system-wide fonts path here, but you can
also set any other path) so the result would look off. In that case, you
could just convert the fonts into outlines, which will make text
machine-unreadable and the file much bigger. For that,
replace "-dCompressFonts=true -dSubsetFonts=true
-sFONTPATH=/usr/share/fonts/" with "-dNoOutputFonts". Since I'm not
completely certain about ghostscript's defaults, you can also add
"-dDownsampleMonoImages=false -dDownsampleGrayImages=false
-dDownsampleColorImages=false" to make sure the images stay otherwise
unchanged.
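
For comparison, the Ghostscript documentation describes PDF/A output as
requiring the `-dPDFA` switch, a colour conversion strategy and the
`PDFA_def.ps` prologue file, none of which appear in the command above. A
hedged sketch along those lines follows; the function name and the `GS` and
`PDFA_DEF` variables are made up for illustration, `PDFA_def.ps` is assumed to
be a local copy edited to point at an ICC profile, and nothing here guarantees
that the result passes a validator.

```shell
# Hypothetical wrapper: convert input $1 to output $2, aiming at PDF/A-1.
# GS overrides the Ghostscript binary; PDFA_DEF overrides the prologue file,
# which is assumed to be an edited copy of Ghostscript's PDFA_def.ps.
gs_pdfa() {
    "${GS:-gs}" -dPDFA=1 -dBATCH -dNOPAUSE \
        -sDEVICE=pdfwrite \
        -sColorConversionStrategy=RGB \
        -dPDFACompatibilityPolicy=1 \
        -sOutputFile="$2" \
        "${PDFA_DEF:-PDFA_def.ps}" "$1"
}
```

Usage would be `gs_pdfa old.pdf new.pdf`. Recent Ghostscript versions run in
safer mode by default, so the ICC profile named in the prologue may
additionally have to be allowed with `--permit-file-read=`; that depends on the
installed version.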

For anything further, you'll have to research yourself as ghostscript is
very complex but used by many people.

Best
Richard


Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread tomas
On Wed, Jul 03, 2024 at 11:05:59AM -0400, Henning Follmann wrote:
> On Wed, Jul 03, 2024 at 03:36:17PM +0200, to...@tuxteam.de wrote:

[...]

> > Uh-oh. We set the standards, but won't tell you what they are.
> 
> But they did! They say PDF/A. But you have a point that this maybe is
> not enough. Which version of PDF/A are we talking about?

Don't get me wrong. The idea of PDF/A is great, the idea of using it
is too... but judging by the Wikipedia entry, the actual implementation
seems to be a mess, with several "levels", one semi-official validator
and a whole bunch of pairwise incompatible validators.

So just specifying PDF/A sounds like a sadistic torture coming out of
Catbert's Evil Human Resources Department :-)

Cheers
-- 
t




Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread Henning Follmann
On Wed, Jul 03, 2024 at 01:06:56PM +, Ceppo wrote:
> I wrote a report with LaTeX, and afterwards discovered it must be
> PDF/A-compliant - which it wasn't. I found the pdfx LaTeX package and followed
> its instructions, thus obtaining a file that should be PDF/A and pdfinfo
> identifies as such, but my employer's upload form thinks it isn't. Is pdfinfo
> reliable enough that I can tell my employer his form is broken? If not, how
> can I make sure that pdflatex's output is actually PDF/A-compliant?
>
> I will also probably have to upload under the same requirement some
> third-party PDF, which is not PDF/A, without access to an editable version.
> Is there a way to convert them to PDF/A? I know that converting from an
> editable version would be the correct way for this, but I have no real way
> to get it.
> 
> A requirement of any solution is that it doesn't rely on non-DFSG-compliant
> software, including online conversion tools.
> 
> Thanks for any help.
> 

I did research a bit. It is possible to create a PDF/A compliant
document from LaTeX. It looks like you have to do some work though.

Please look at this thread at StackExchange. I found that to be very
helpful.
https://tex.stackexchange.com/questions/130201/pdf-a-with-hyperref-on-tex-live-2013/136653#136653

Please let me know how it works out for you.

-H


-- 
Henning Follmann   | hfollm...@itcfollmann.com



Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread Henning Follmann
On Wed, Jul 03, 2024 at 03:36:17PM +0200, to...@tuxteam.de wrote:
> On Wed, Jul 03, 2024 at 01:06:56PM +, Ceppo wrote:
> > I wrote a report with LaTeX, and afterwards discovered it must be
> > PDF/A-compliant - which it wasn't. I found the pdfx LaTeX package and followed
> > its instructions, thus obtaining a file that should be PDF/A and pdfinfo
> > identifies as such, but my employer's upload form thinks it isn't [...]
> 
> Uh-oh. We set the standards, but won't tell you what they are.

But they did! They say PDF/A. But you have a point that this may not be
enough. Which version of PDF/A are we talking about?

In general the policy is most likely a good one, because PDF/A gives you
certain guarantees (e.g. that the document renders consistently to the
same printed output, even years after archiving).

> 
> > Thanks for any help.
> 
> Not concrete help, but the Wikipedia [1] makes for an interesting
> read (including refs to bunches of test suites you can throw at your
> publisher's site to find out where their validator is failing).
> 
> And there seems to be a kind of semi-official validator, according
> to the above ref.

I never tried to generate PDF/A from LaTeX but I am sure it is possible.
By default it would not include any javascript and IIRC it embeds the
font.

> 
> Cheers
> 
> [1] https://en.wikipedia.org/wiki/PDF/A
> -- 
> t



-- 
Henning Follmann   | hfollm...@itcfollmann.com



Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread Sarunas Burdulis

On 7/3/24 09:06, Ceppo wrote:

I wrote a report with LaTeX, and afterwards discovered it must be
PDF/A-compliant - which it wasn't. I found the pdfx LaTeX package and followed its
instructions, thus obtaining a file that should be PDF/A and pdfinfo identifies
as such, but my employer's upload form thinks it isn't. Is pdfinfo reliable enough
that I can tell my employer his form is broken? If not, how can I make sure
that pdflatex's output is actually PDF/A-compliant?


pdfinfo probably only reads metadata, but does not do any PDF/A 
compliance validation.
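
If pdfinfo indeed only reports what the file declares about itself, the
declaration can be inspected directly: `pdfinfo -meta` prints the XMP metadata
stream, where a PDF/A claim shows up as `pdfaid:part` and `pdfaid:conformance`.
The sketch below only extracts that claim and validates nothing; the function
name is made up for illustration.

```shell
# Hypothetical helper: read XMP metadata on stdin and print the PDF/A level it
# claims. Handles both element form (<pdfaid:part>1</pdfaid:part>) and
# attribute form (pdfaid:part="1").
pdfa_claim() {
    meta=$(cat)
    part=$(printf '%s\n' "$meta" |
        sed -n 's/.*pdfaid:part[^0-9]*\([0-9]\).*/\1/p' | head -n 1)
    conf=$(printf '%s\n' "$meta" |
        sed -n 's/.*pdfaid:conformance[^A-Za-z]*\([ABUabu]\).*/\1/p' | head -n 1)
    if [ -n "$part" ]; then
        echo "claims PDF/A-$part$conf"
    else
        echo "no PDF/A claim in metadata"
    fi
}
```

Usage: `pdfinfo -meta report.pdf | pdfa_claim`. A file can carry this claim and
still fail real validation, which would explain a disagreement between pdfinfo
and an upload form.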


VeraPDF seems to work for validation (https://verapdf.org/software/).

--
Sarunas Burdulis
Dartmouth Mathematics
math.dartmouth.edu/~sarunas

· https://useplaintext.email ·





Re: Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread tomas
On Wed, Jul 03, 2024 at 01:06:56PM +, Ceppo wrote:
> I wrote a report with LaTeX, and afterwards discovered it must be
> PDF/A-compliant - which it wasn't. I found the pdfx LaTeX package and followed
> its instructions, thus obtaining a file that should be PDF/A and pdfinfo
> identifies as such, but my employer's upload form thinks it isn't [...]

Uh-oh. We set the standards, but won't tell you what they are.

> Thanks for any help.

Not concrete help, but the Wikipedia [1] makes for an interesting
read (including refs to bunches of test suites you can throw at your
publisher's site to find out where their validator is failing).

And there seems to be a kind of semi-official validator, according
to the above ref.

Cheers

[1] https://en.wikipedia.org/wiki/PDF/A
-- 
t




Creating PDF/A from LaTeX source and from existing PDF

2024-07-03 Thread Ceppo
I wrote a report with LaTeX, and afterwards discovered it must be
PDF/A-compliant - which it wasn't. I found the pdfx LaTeX package and followed its
instructions, thus obtaining a file that should be PDF/A and pdfinfo identifies
as such, but my employer's upload form thinks it isn't. Is pdfinfo reliable enough
that I can tell my employer his form is broken? If not, how can I make sure
that pdflatex's output is actually PDF/A-compliant?

I will also probably have to upload under the same requirement some third-party
PDF, which is not PDF/A, without access to an editable version. Is there a way
to convert them to PDF/A? I know that converting from an editable version would
be the correct way for this, but I have no real way to get it.

A requirement of any solution is that it doesn't rely on non-DFSG-compliant
software, including online conversion tools.

Thanks for any help.


--
Ceppo

