[jira] [Commented] (TIKA-1195) XLSB support
[ https://issues.apache.org/jira/browse/TIKA-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974972#comment-15974972 ]

Hudson commented on TIKA-1195:
------------------------------

SUCCESS: Integrated in Jenkins build tika-2.x-windows #198 (See [https://builds.apache.org/job/tika-2.x-windows/198/])
TIKA-1195 and TIKA-2329, upgrade to POI 3.16-final and add xlsb parser (tallison: rev a847a863d1e25a9ba8209cd28c3e98be153f34a5)
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParser.java
* (edit) tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java
* (edit) tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/microsoft/ExcelParserTest.java
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/XSSFExcelExtractorDecorator.java
* (edit) CHANGES.txt
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLExtractorFactory.java
* (edit) tika-parser-modules/pom.xml
* (add) tika-test-resources/src/test/resources/test-documents/testEXCEL_various.xlsb
* (add) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/XSSFBExcelExtractorDecorator.java

> XLSB support
> ------------
>
>                 Key: TIKA-1195
>                 URL: https://issues.apache.org/jira/browse/TIKA-1195
>             Project: Tika
>          Issue Type: Improvement
>          Components: general
>   Affects Versions: 1.4
>         Environment: W2008R2
>            Reporter: Frederic Ronny
>              Labels: new-parser
>             Fix For: 2.0, 1.15
>
>
> We use Manifoldcf 1.3 and Solr 4.4 to index a shared network drive, works
> fine for most of our Office filetypes ( docx, xlsx, ) but we also have a
> lot of files with filetype xlsb which are not in the supported filetypes.
> In order to keep using this solution it is essential to us that there will be
> a solution provided in the future

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (TIKA-1195) XLSB support
[ https://issues.apache.org/jira/browse/TIKA-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974823#comment-15974823 ]

Hudson commented on TIKA-1195:
------------------------------

SUCCESS: Integrated in Jenkins build Tika-trunk #1240 (See [https://builds.apache.org/job/Tika-trunk/1240/])
TIKA-1195 and TIKA-2329 (tallison: [https://github.com/apache/tika/commit/67612b8f805ad5d1085db14922d3b3b6ddce19bf])
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParser.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ExcelParserTest.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLExtractorFactory.java
* (edit) CHANGES.txt
* (edit) tika-parsers/pom.xml
* (add) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/XSSFBExcelExtractorDecorator.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/XSSFExcelExtractorDecorator.java
[jira] [Commented] (TIKA-1195) XLSB support
[ https://issues.apache.org/jira/browse/TIKA-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929700#comment-15929700 ]

Tim Allison commented on TIKA-1195:
-----------------------------------

I think so, but It Depends (TM). I figure the next version of POI will be out in early/mid April, and then we could shoot for 1.15. All depends on our communities of devs, though. Next version of PDFBox should be out within the next few days.
[jira] [Commented] (TIKA-1195) XLSB support
[ https://issues.apache.org/jira/browse/TIKA-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929105#comment-15929105 ]

Matthew Caruana Galizia commented on TIKA-1195:
-----------------------------------------------

[~talli...@mitre.org] d'you reckon that will be out with Tika 1.15?
[jira] [Commented] (TIKA-1195) XLSB support
[ https://issues.apache.org/jira/browse/TIKA-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15928636#comment-15928636 ]

Tim Allison commented on TIKA-1195:
-----------------------------------

Basic xlsb streaming/read-only support was just added to POI and should be available with POI 3.16-beta3.
[jira] [Commented] (TIKA-1195) XLSB support
[ https://issues.apache.org/jira/browse/TIKA-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15898084#comment-15898084 ]

Tim Allison commented on TIKA-1195:
-----------------------------------

https://bz.apache.org/bugzilla/show_bug.cgi?id=60826
[jira] [Commented] (TIKA-1195) XLSB support
[ https://issues.apache.org/jira/browse/TIKA-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897353#comment-15897353 ]

Tim Allison commented on TIKA-1195:
-----------------------------------

From [~mcaruanagalizia] via Twitter, an ASL 2.0 licensed JavaScript xlsb parser: https://github.com/SheetJS/js-xlsx
[jira] [Commented] (TIKA-1195) XLSB support
[ https://issues.apache.org/jira/browse/TIKA-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15083679#comment-15083679 ]

Dominik Stadler commented on TIKA-1195:
---------------------------------------

Some official description is at https://msdn.microsoft.com/en-us/library/office/cc313133(v=office.12).aspx
[jira] [Commented] (TIKA-1195) XLSB support
[ https://issues.apache.org/jira/browse/TIKA-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364023#comment-14364023 ]

Nick Burch commented on TIKA-1195:
----------------------------------

No POI support as yet. It will take a non-trivial amount of work (single-digit days at minimum, maybe just into double-digit days) to support it properly, and I'd guess at least a day for a hacky solution. Thus far, there are no takers to sponsor or do the work involved.