Hi,

I don't think what you want to do is easy, or even feasible. Poppler uses a heuristic, and with -layout it tries to preserve the physical layout of the page. While poppler does detect columns, I don't think it detects tables (all the code for this is in the TextOutputDev.cc file). That said, if I run your file through my pdftotext (0.24.3, I don't know if that matters) then I DO get the arrows, so you could use the arrows to postprocess your file and add the lines you want.
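To illustrate the postprocessing idea, here is a minimal Python sketch. It is only an assumption about what the fix could look like, not something poppler provides: the function name is made up, and the arrow glyph is assumed to be U+27A2, which may not be the character that actually appears in your output. It inserts an empty line before every line that contains an arrow, unless the previous line is already empty.

```python
# Hypothetical helper, not part of poppler.
# ASSUMPTION: the arrow bullet is U+27A2; check your table.txt and adjust.
ARROW = "\u27a2"


def add_blank_lines_before_bullets(lines, bullet=ARROW):
    """Return the lines with an empty line inserted before each bulleted line."""
    out = []
    for line in lines:
        text = line.rstrip("\n")
        # Add a separator only if there is a previous line and it is not blank.
        if bullet in text and out and out[-1].strip():
            out.append("")
        out.append(text)
    return out


# Small in-memory example standing in for the pdftotext output.
sample = [
    ARROW + " first item      " + ARROW + " other column",
    "  continued text",
    ARROW + " second item",
]
for row in add_blank_lines_before_bullets(sample):
    print(row)
```

To use it on real output you would read the lines of table.txt, pass them to the function and write the result back. Which lines you actually need to add depends on the layout you are after, so treat this only as a starting point.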
Greetings
José

On Mon, Dec 2, 2013 at 12:21 PM, Nishanth Lawrence <[email protected]> wrote:
> Hi ,
> Sorry my previous mail was not formatted correctly due to tables , so I
> have given links to google docs .
>
> I am using pdftotext version 0.24.2 . Following is my case
>
> https://drive.google.com/file/d/0Bwj-LRZNYWXvTXVZNHNyQnNNd00/edit?usp=sharing
>
> While extracting using the following command line utility
>
> pdftotext table.pdf table.txt -layout -nopgbrk -q
>
> I am getting the following output
>
> https://docs.google.com/file/d/0Bwj-LRZNYWXvSGdwa2FXemtydDQ/edit
>
> So what I want is , if there in no bullet in any of the line then there
> should be empty line in opposite column , could you please tell me what to
> change in the code so that I could get an output similar to this
>
> https://docs.google.com/file/d/0Bwj-LRZNYWXvck9jMmQtWFU1VkU/edit
>
> Or at least which part of the code has to be modified to achieve the
> above .
>
> Thanks in advance
>
> --
> With Regards
> Nishanth R Lawrence
>
> _______________________________________________
> poppler mailing list
> [email protected]
> http://lists.freedesktop.org/mailman/listinfo/poppler
