[GitHub] [tika] PeterAlfredLee commented on pull request #357: Modify TikaInputStream

2020-09-23 Thread GitBox
PeterAlfredLee commented on pull request #357: URL: https://github.com/apache/tika/pull/357#issuecomment-698140750 @tballison >why this hasn't caused problems before Because the method `getPath` uses the original InputStream instead of the current TikaInputStream when copying data to a temporary

[GitHub] [tika] PeterAlfredLee opened a new pull request #363: Modify TikaInputStream

2020-09-23 Thread GitBox
PeterAlfredLee opened a new pull request #363: URL: https://github.com/apache/tika/pull/363 1. In method `getPath`, copying data from the InputStream to a temporary file should use the current TikaInputStream. Otherwise the TikaInputStream variable `position` will not be updated. There is n
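
The effect described in this pull request can be illustrated with a minimal, self-contained sketch. It is not Tika's code: `CountingStream` is a hypothetical stand-in for a wrapper that tracks a `position` field, and `positionAfterCopy` is an invented helper. It only shows why a copy that reads from the wrapped stream directly leaves the wrapper's counter untouched, while a copy that reads through the wrapper advances it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class PositionDemo {

    /** Hypothetical wrapper that counts bytes read through it (not Tika's class). */
    static class CountingStream extends FilterInputStream {
        private long position = 0;

        CountingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b != -1) {
                position++;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                position += n;
            }
            return n;
        }

        long getPosition() {
            return position;
        }
    }

    /** Plain buffered copy, standing in for "copy data to a temporary file". */
    static void copy(InputStream from, OutputStream to) throws IOException {
        byte[] buf = new byte[4096];
        int n;
        while ((n = from.read(buf)) != -1) {
            to.write(buf, 0, n);
        }
    }

    /** Returns the wrapper's position after copying via the wrapper or via the wrapped stream. */
    static long positionAfterCopy(byte[] data, boolean copyThroughWrapper) {
        InputStream original = new ByteArrayInputStream(data);
        CountingStream wrapper = new CountingStream(original);
        try {
            copy(copyThroughWrapper ? wrapper : original, new ByteArrayOutputStream());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return wrapper.getPosition();
    }

    public static void main(String[] args) {
        byte[] data = new byte[1000];
        // Reading from the wrapped stream bypasses the counter.
        System.out.println("copied from original: position = " + positionAfterCopy(data, false));
        // Reading through the wrapper keeps the counter in sync.
        System.out.println("copied from wrapper:  position = " + positionAfterCopy(data, true));
    }
}
```

With 1000 bytes of input, the first call reports a position of 0 and the second reports 1000, which is the kind of mismatch the pull request describes.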

[GitHub] [tika] PeterAlfredLee commented on pull request #356: Attempt to read zips with STORED data descriptors

2020-09-23 Thread GitBox
PeterAlfredLee commented on pull request #356: URL: https://github.com/apache/tika/pull/356#issuecomment-698075634 > In PackageParser if the ArchiveInputStream is a ZipArchiveInputStream, iterate through all the zip entries without trying to parse/read them. This will make it so we do not

[GitHub] [tika] PeterAlfredLee commented on a change in pull request #356: Attempt to read zips with STORED data descriptors

2020-09-23 Thread GitBox
PeterAlfredLee commented on a change in pull request #356: URL: https://github.com/apache/tika/pull/356#discussion_r493998557 ## File path: tika-parser-modules/tika-parser-pkg-module/src/main/java/org/apache/tika/parser/pkg/PackageParser.java ## @@ -287,6 +284,35 @@ public voi

[GitHub] [tika] PeterAlfredLee commented on pull request #356: Attempt to read zips with STORED data descriptors

2020-09-23 Thread GitBox
PeterAlfredLee commented on pull request #356: URL: https://github.com/apache/tika/pull/356#issuecomment-698067189 I updated a little bit with the DD signature part. This works with Compress 1.20 and works fine with this PR. ``` private InputStream forgeZipInputStream() throws

[GitHub] [tika] PeterAlfredLee commented on pull request #356: Attempt to read zips with STORED data descriptors

2020-09-23 Thread GitBox
PeterAlfredLee commented on pull request #356: URL: https://github.com/apache/tika/pull/356#issuecomment-698066699 > @PeterAlfredLee - I just put that zip generation code into the ZipParserTest but it's throwing a 'Truncated Zip File' exception when run. Oops. I forgot to add the Dat

[jira] [Commented] (TIKA-3196) PackageParser should attempt to parse entries from zip files with STORED entries with data descriptor

2020-09-23 Thread Trevor Bentley (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200878#comment-17200878 ] Trevor Bentley commented on TIKA-3196: -- [~tallison] [~peterlee] FYI - the zip archiv

[GitHub] [tika] tbentleypfpt edited a comment on pull request #356: Attempt to read zips with STORED data descriptors

2020-09-23 Thread GitBox
tbentleypfpt edited a comment on pull request #356: URL: https://github.com/apache/tika/pull/356#issuecomment-697478056 @PeterAlfredLee - I just put that zip generation code into the ZipParserTest but it's throwing a 'Truncated Zip File' exception when run. org.apache.tika.excep

[GitHub] [tika] tbentleypfpt commented on pull request #356: Attempt to read zips with STORED data descriptors

2020-09-23 Thread GitBox
tbentleypfpt commented on pull request #356: URL: https://github.com/apache/tika/pull/356#issuecomment-697478056 @PeterAlfredLee - I just put that zip generation code into the ZipParserTest but it's throwing a 'Truncated Zip File' exception when run. `org.apache.tika.exception.TikaEx

[GitHub] [tika] tbentleypfpt edited a comment on pull request #356: Attempt to read zips with STORED data descriptors

2020-09-23 Thread GitBox
tbentleypfpt edited a comment on pull request #356: URL: https://github.com/apache/tika/pull/356#issuecomment-697478056 @PeterAlfredLee - I just put that zip generation code into the ZipParserTest but it's throwing a 'Truncated Zip File' exception when run. org.apache.tika.exception.

[GitHub] [tika] tbentleypfpt commented on pull request #356: Attempt to read zips with STORED data descriptors

2020-09-23 Thread GitBox
tbentleypfpt commented on pull request #356: URL: https://github.com/apache/tika/pull/356#issuecomment-697456940 I have added the resetting of the stream and added some unit tests. Though I will add in the zip generation code from @PeterAlfredLee in the tests instead of adding a zip archiv