[ https://issues.apache.org/jira/browse/TIKA-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529101#comment-17529101 ]
Nick Burch commented on TIKA-3742:
----------------------------------

Assuming we just want type=17 text elements of a DGNv7 file (as per [http://dgnlib.maptools.org/dgn.html#type17] ) then a quick'n'dirty parser wouldn't be too bad....

[https://gist.github.com/Gagravarr/90d390fec7c5f2c5cf966c0eedccac5c] is a basic reader that finds these text elements and prints them.

Couldn't immediately spot any useful metadata elements to pull out, so I think a basic parser for DGN7 would just extract the text.

Anyone fancy finishing this off into a "proper" Tika parser? :)

> Advice around DGN7 parser and whether to add to TIKA
> ----------------------------------------------------
>
>                 Key: TIKA-3742
>                 URL: https://issues.apache.org/jira/browse/TIKA-3742
>             Project: Tika
>          Issue Type: Task
>          Components: parser
>            Reporter: Dan Coldrick
>            Priority: Minor
>         Attachments: DGN.zip, ExampleOutput.txt
>
>
> Hi [~tallison] & whoever else.
> I managed to compile the C/C++ library [http://dgnlib.maptools.org/] for
> DGN7, which produces a dgndump.exe that will dump all the data from the DGN.
> From my initial testing it looks pretty good.
> Would you guys think it was worth adding this, or just keep it as a custom
> parser rather than in the main source code? It's under the MIT license. I've
> attached the exe (zipped), a copy of the output from the dump, and my very
> dirty testing code that calls the exe (I was only interested in the strings,
> so I am only pulling those into a string array at the moment to check it's
> pulling out the correct data).

--
This message was sent by Atlassian Jira
(v8.20.7#820007)