If you have a look at the PDF spec, you'll see that all PDFs are required to start with the following:
%PDF-1.X

Where X is the minor version number that the PDF claims to support. This header is usually followed by a comment line containing a few high-order (non-ASCII) bytes, so that tools that guess binary-vs-text will treat the PDF as a binary stream. That matters because a text-mode transfer can rewrite line endings, and maintaining accurate byte offsets is critical in PDF.

So checking whether the BLOB starts with "%PDF-" should tell you which kind of file you have.

--Mark Storer
  Senior Software Engineer
  Cardiff.com

import legalese.Disclaimer;
Disclaimer<Cardiff> DisCard = null;

> -----Original Message-----
> From: Ed M [mailto:[email protected]]
> Sent: Thursday, July 15, 2010 7:58 AM
> To: [email protected]
> Subject: [iText-questions] Distinguishing a Text Input Stream from a
> PDF Input Stream
>
> I am reading legacy data "files" from a database stored as BLOBs. When
> the "files" were first loaded, the file type was not captured by the
> submitter as an attribute, so I am left with the task of reconstituting
> these files as PDF output streams without knowing whether they were
> originally text files or PDFs. The trick is to determine whether the
> BLOB contains a text file or a PDF file. Each of these is reconstituted
> into a PDF in a different fashion.
>
> Is there a way to determine what the file type (or mime type) of the
> input stream is once I have the java.io.InputStream obtained using
> oracle.sql.BLOB.getBinaryStream(1L)? Or can anyone suggest a better
> method?
>
> Any assistance would be greatly appreciated. Thanks.
>
> sample snippet:
> -------------------------------------
> public byte[] getBlob(oracle.sql.BLOB blob) throws java.sql.SQLException {
>     java.io.ByteArrayOutputStream outputStream =
>         new java.io.ByteArrayOutputStream();
>     java.io.InputStream inputStream = null;
>     // declared outside the try blocks so it is in scope for the return
>     byte[] tmpArray = null;
>
>     // determine file type??? or catch InvalidPdfException???
>
>     try {
>         // IF blob file type is PDF...
>         inputStream = blob.getBinaryStream(1L);
>         int bytesRead = 0;
>
>         // copy the BLOB contents one byte at a time
>         while ((bytesRead = inputStream.read()) != -1) {
>             outputStream.write(bytesRead);
>         }
>
>         if (inputStream != null) inputStream.close();
>         if (outputStream != null) outputStream.close();
>         tmpArray = outputStream.toByteArray();
>     }
>     catch (com.itextpdf.text.exceptions.InvalidPdfException e) {
>         try {
>             // IF blob file type is TEXT...
>             java.io.BufferedInputStream bis = null;
>             java.io.DataInputStream dis = null;
>             com.itextpdf.text.Document document =
>                 new com.itextpdf.text.Document();
>
>             inputStream = blob.getBinaryStream(1L);
>             bis = new java.io.BufferedInputStream(inputStream);
>             dis = new java.io.DataInputStream(bis);
>             com.itextpdf.text.pdf.PdfWriter writer =
>                 com.itextpdf.text.pdf.PdfWriter.getInstance(document, outputStream);
>             document.open();
>             // add each line of the text file as its own paragraph
>             while (dis.available() != 0) {
>                 document.add(new com.itextpdf.text.Paragraph(dis.readLine()));
>             }
>
>             if (document != null) document.close();
>             if (bis != null) bis.close();
>             if (dis != null) dis.close();
>             tmpArray = outputStream.toByteArray();
>         }
>         catch (Exception e2) { e2.printStackTrace(); }
>     }
>     catch (Exception e) { e.printStackTrace(System.out); }
>
>     return tmpArray;
> }
> -------------------------------------
> end sample snippet:
>
> --
> View this message in context:
> http://itext-general.2136553.n4.nabble.com/Distinguishing-a-Text-Input-Stream-from-a-PDF-Input-Stream-tp2290278p2290278.html
> Sent from the iText - General mailing list archive at Nabble.com.
_______________________________________________
iText-questions mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/itext-questions

Buy the iText book: http://www.itextpdf.com/book/
Check the site with examples before you ask questions:
http://www.1t3xt.info/examples/
You can also search the keywords list:
http://1t3xt.info/tutorials/keywords/
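
For what it's worth, the header check described at the top of this reply could be sketched roughly as follows. This is only a minimal illustration, not anything taken from iText: the class name PdfSniffer and the methods looksLikePdf and readFully are hypothetical names made up for this example. It assumes the BLOB has already been read into a byte array, and it only tests for "%PDF-" at offset 0, so a file with stray bytes in front of the header would be classified as text.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of iText.
class PdfSniffer {

    // Every PDF header starts with these five ASCII bytes.
    private static final byte[] PDF_MAGIC =
            "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the data begins with the "%PDF-" header marker. */
    static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads the whole stream into memory using a buffer. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }
}
```

In getBlob() one might call readFully(blob.getBinaryStream()) once, then branch on looksLikePdf(bytes): return the bytes unchanged for a PDF, or feed them to the text-to-PDF path otherwise. That avoids relying on InvalidPdfException, which a plain byte copy would not be expected to throw anyway.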
