[jira] [Commented] (TIKA-3218) Wrong comment for method sortLoadedClasses in ServiceLoaderUtils

2020-11-05 Thread Peter Lee (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227108#comment-17227108 ] Peter Lee commented on TIKA-3218: - _so that user-provided ones would come first and would be able to

[GitHub] [tika] PeterAlfredLee commented on pull request #372: Modify some calls of method Collection.toArray

2020-11-05 Thread GitBox
PeterAlfredLee commented on pull request #372: URL: https://github.com/apache/tika/pull/372#issuecomment-722748153 I'm glad this helps a little bit. :) This is an automated message from the Apache Git Service. To respond to

[jira] [Comment Edited] (TIKA-3221) /rmeta/text endpoint - allow a "max parse time" parameter where after exceeded, return bytes/metadata managed to get up to that point

2020-11-05 Thread Luís Filipe Nassif (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227095#comment-17227095 ] Luís Filipe Nassif edited comment on TIKA-3221 at 11/6/20, 1:04 AM: My

[jira] [Commented] (TIKA-3221) /rmeta/text endpoint - allow a "max parse time" parameter where after exceeded, return bytes/metadata managed to get up to that point

2020-11-05 Thread Luís Filipe Nassif (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227095#comment-17227095 ] Luís Filipe Nassif commented on TIKA-3221: -- My 2 cents, in the past I had

[GitHub] [tika] kkrugler merged pull request #372: Modify some calls of method Collection.toArray

2020-11-05 Thread GitBox
kkrugler merged pull request #372: URL: https://github.com/apache/tika/pull/372

[GitHub] [tika] kkrugler commented on pull request #372: Modify some calls of method Collection.toArray

2020-11-05 Thread GitBox
kkrugler commented on pull request #372: URL: https://github.com/apache/tika/pull/372#issuecomment-722693751 BTW @PeterAlfredLee - thanks for the ref to https://shipilev.net/blog/2016/arrays-wisdom-ancients/, fun stuff.

[GitHub] [tika] kkrugler commented on pull request #369: Use IOException instead of IOExceptionWithCause

2020-11-05 Thread GitBox
kkrugler commented on pull request #369: URL: https://github.com/apache/tika/pull/369#issuecomment-722693051 Hi @tballison - you said: > I, frankly, want to keep some subclass of IOException around whether that's IOExceptionWithCause or something else. My reasoning is that we have

[GitHub] [tika] kkrugler merged pull request #376: Simplify some sort statement use List.sort and lambda

2020-11-05 Thread GitBox
kkrugler merged pull request #376: URL: https://github.com/apache/tika/pull/376

[GitHub] [tika] kkrugler merged pull request #375: Add CompareUtils and simplify some sort

2020-11-05 Thread GitBox
kkrugler merged pull request #375: URL: https://github.com/apache/tika/pull/375

[jira] [Commented] (TIKA-3221) /rmeta/text endpoint - allow a "max parse time" parameter where after exceeded, return bytes/metadata managed to get up to that point

2020-11-05 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226952#comment-17226952 ] Tim Allison commented on TIKA-3221: --- There may still be some gotchas...but we'll see. > /rmeta/text

[jira] [Commented] (TIKA-3221) /rmeta/text endpoint - allow a "max parse time" parameter where after exceeded, return bytes/metadata managed to get up to that point

2020-11-05 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226951#comment-17226951 ] Nicholas DiPiazza commented on TIKA-3221: - that would be amazing! that will drastically improve my

[jira] [Commented] (TIKA-3221) /rmeta/text endpoint - allow a "max parse time" parameter where after exceeded, return bytes/metadata managed to get up to that point

2020-11-05 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226949#comment-17226949 ] Tim Allison commented on TIKA-3221: --- Ah, ok. If you can use tika-server, that'd be great for a larger

[jira] [Comment Edited] (TIKA-3221) /rmeta/text endpoint - allow a "max parse time" parameter where after exceeded, return bytes/metadata managed to get up to that point

2020-11-05 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226940#comment-17226940 ] Nicholas DiPiazza edited comment on TIKA-3221 at 11/5/20, 7:33 PM: --- Yep

[jira] [Comment Edited] (TIKA-3221) /rmeta/text endpoint - allow a "max parse time" parameter where after exceeded, return bytes/metadata managed to get up to that point

2020-11-05 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226940#comment-17226940 ] Nicholas DiPiazza edited comment on TIKA-3221 at 11/5/20, 7:31 PM: --- Yep

[jira] [Commented] (TIKA-3221) /rmeta/text endpoint - allow a "max parse time" parameter where after exceeded, return bytes/metadata managed to get up to that point

2020-11-05 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226940#comment-17226940 ] Nicholas DiPiazza commented on TIKA-3221: - Yep this was a great idea "in theory" but in practice,

[jira] [Commented] (TIKA-3221) /rmeta/text endpoint - allow a "max parse time" parameter where after exceeded, return bytes/metadata managed to get up to that point

2020-11-05 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226908#comment-17226908 ] Tim Allison commented on TIKA-3221: --- ConcurrentHashMap doesn't appear to add any overhead for our

[jira] [Commented] (TIKA-3221) /rmeta/text endpoint - allow a "max parse time" parameter where after exceeded, return bytes/metadata managed to get up to that point

2020-11-05 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226906#comment-17226906 ] Tim Allison commented on TIKA-3221: --- [~ndipiazza_gmail] how did you avoid

[jira] [Commented] (TIKA-3220) ForkParser displays incorrect message when parse timeout is reached

2020-11-05 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226837#comment-17226837 ] Nicholas DiPiazza commented on TIKA-3220: - Yeah i was thinking #1 when i opened the ticket.

[jira] [Commented] (TIKA-3218) Wrong comment for method sortLoadedClasses in ServiceLoaderUtils

2020-11-05 Thread Nick Burch (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226820#comment-17226820 ] Nick Burch commented on TIKA-3218: -- I think the idea of this was so that eg Parsers would have

[jira] [Commented] (TIKA-3221) /rmeta/text endpoint - allow a "max parse time" parameter where after exceeded, return bytes/metadata managed to get up to that point

2020-11-05 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226804#comment-17226804 ] Tim Allison commented on TIKA-3221: --- {noformat} Perhaps when checking byte size, periodically check time

[jira] [Commented] (TIKA-3221) /rmeta/text endpoint - allow a "max parse time" parameter where after exceeded, return bytes/metadata managed to get up to that point

2020-11-05 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226802#comment-17226802 ] Tim Allison commented on TIKA-3221: --- [~ndipiazza_gmail], if we add this capability to tika-server, will

Re: [PMCs] Ramping up for Google Summer of Code 2021: invitation to participate

2020-11-05 Thread lewis john mcgibbney
Hi dev@, Is anyone interested in co-mentoring https://issues.apache.org/jira/browse/TIKA-94 ? Lewis On Mon, Nov 2, 2020 at 7:52 PM Sally Khudairi wrote: > Hello PMCs --I hope you are all well. > > ASF Community Development (ComDev) oversees our participation in Google > Summer of Code, for

[jira] [Commented] (TIKA-3220) ForkParser displays incorrect message when parse timeout is reached

2020-11-05 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226797#comment-17226797 ] Tim Allison commented on TIKA-3220: --- I'll clean up the unit tests to make it clearer that the existing