Setting up an Apache Tika Meetup [was: Recording/Streaming Apache Tika Virtual Meetings to YouTube]
All,
  Unless there are objections, I'll set up an Apache Tika Meetup later
today. If there are better media options, let me know.

  Best,

    Tim

On Thu, Oct 14, 2021 at 12:05 PM Tim Allison wrote:
>
> Lewis,
>   Thank you for getting the ball rolling on this. I think it would be
> great to have semi-regular meetings of devs and/or community outreach.
> For example, I'd like to host an outreachy, tika-eval deep dive for
> [1]. I can think of a few other outreachy topics, especially around
> migrating to 2.x.
>   Any objections if I started a Meetup group? Did we ever settle on a
> platform?
>
> Cheers,
>
> Tim
>
>
> [1] https://www.dpconline.org/events/world-digital-preservation-day
>
> On Wed, May 19, 2021 at 1:57 PM lewis john mcgibbney
> wrote:
> >
> > Hi Swapnil,
> > Excellent, thank you. Replies inline below.
> >
> > On Wed, May 19, 2021 at 9:53 AM Swapnil M Mane
> > wrote:
> >
> > >
> > > If it is a community meetup where the participants have active
> > > involvement in conversation, we should not go for YouTube live.
> > >
> >
> > It IS a community meetup in which participants actively engage and trade
> > conversation and opinions. So it sounds like YouTube live is not the
> > correct solution.
> >
> >
> > > One of the popular tools used for live streams is Streamyard. You can
> > > find more details here [1].
> > >
> >
> > I had never heard of it, thanks for the pointer.
> >
> >
> > >
> > > By the way, which tool did the community use for the last meeting (Zoom,
> > > Google Meet or something else)?
> >
> >
> > The meeting was hosted on a paid version of WebEx. It would be great if we
> > could move away from this for the next meeting.
> >
> > lewismc
[jira] [Commented] (TIKA-3575) Cannot use loadErrorHandler="ignore" in tika config
    [ https://issues.apache.org/jira/browse/TIKA-3575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17429131#comment-17429131 ]

Andreas Hubold commented on TIKA-3575:
--------------------------------------

Thanks [~tallison],

I'd suggest
 * either changing the default for loadErrorHandler in TikaConfig#serviceLoaderFromDomElement back to IGNORE (this would be my preferred choice, and a very simple change)
 * or keeping the default at THROW but extending #serviceLoaderFromDomElement to check for a value of "ignore" in the attribute and respect that.

And if the default is THROW now, it should also be the default if no service-loader element is specified; otherwise it feels inconsistent and could surprise users. If you search for org.apache.tika.config.LoadErrorHandler#IGNORE, you can see that it's still the default in some places.

{quote}The goal was to allow finer-grained module selection so that you'd never have load errors that you'd want to ignore.
{quote}

I really like the separation into modules in Tika 2.x. That's a great improvement!

Our use case for LoadErrorHandler#IGNORE: It can still be useful to include a module but exclude some of its parsers/dependencies. For example, we're using tika-parser-code-module but don't need the Matlab parser and SAS7BDATParser, so we want to exclude the parso and jmatio dependencies to reduce the number of dependencies. It's a nice feature that this disables the parsers without requiring additional configuration in tika config (and our downstream users could simply add dependencies to enable parsers without touching configuration).

I think it's a good idea to bundle different parsers into logical modules, like the different code parsers in tika-parser-code-module. But sometimes that may not be fine-grained enough, and that's where LoadErrorHandler#IGNORE plays a nice role, IMHO.
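
To illustrate the second option above, here is a minimal sketch of attribute handling that also accepts "ignore". This is not the actual Tika code: the Handler enum is a hypothetical stand-in for org.apache.tika.config.LoadErrorHandler, and the class and method names are invented for the example. It assumes the comparison is case-insensitive and that a missing or unrecognized value falls back to whatever default the caller passes in.

```java
import java.util.Locale;

final class LoadErrorHandlerSketch {

    // Hypothetical stand-in for org.apache.tika.config.LoadErrorHandler.
    enum Handler { IGNORE, WARN, THROW }

    /**
     * Maps the value of a loadErrorHandler attribute to a handler.
     * A null, blank, or unrecognized value yields the supplied default
     * (an assumption of this sketch, not confirmed Tika behavior).
     */
    static Handler fromAttribute(String value, Handler defaultHandler) {
        if (value == null || value.trim().isEmpty()) {
            return defaultHandler;
        }
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "ignore":
                return Handler.IGNORE; // the value the issue reports as unsupported
            case "warn":
                return Handler.WARN;
            case "throw":
                return Handler.THROW;
            default:
                return defaultHandler;
        }
    }

    public static void main(String[] args) {
        // With THROW as the default, "ignore" is now respected explicitly.
        System.out.println(fromAttribute("ignore", Handler.THROW));
        System.out.println(fromAttribute("WARN", Handler.THROW));
        // No attribute present: the default applies.
        System.out.println(fromAttribute(null, Handler.THROW));
    }
}
```

Passing the default in as a parameter keeps the two suggestions independent: the default can stay at THROW or go back to IGNORE without touching the attribute parsing.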
> Cannot use loadErrorHandler="ignore" in tika config
> ---------------------------------------------------
>
>                 Key: TIKA-3575
>                 URL: https://issues.apache.org/jira/browse/TIKA-3575
>             Project: Tika
>          Issue Type: Bug
>          Components: config
>    Affects Versions: 2.0.0, 2.1.0
>            Reporter: Andreas Hubold
>            Priority: Major
>              Labels: regression
>
> Tika 2.0.0 changed the default error handler to throw exceptions, and does
> not ignore errors when loading parsers anymore as it was the case with Tika
> 1.x.
> See
> [https://github.com/apache/tika/commit/e47c6cd62e587fdaae7e2e999f37122d09449754#diff-3955d56f4d95c6e600966c486c58f92483c900d32d553d18b3cf2940cbf2c768R470|https://github.com/apache/tika/commit/e47c6cd62e587fdaae7e2e999f37122d09449754#diff-3955d56f4d95c6e600966c486c58f92483c900d32d553d18b3cf2940cbf2c768R470]
> There's no configuration option to restore the previous behavior. It should
> be possible to set
> {code}
>
> {code}
> but the code in org.apache.tika.config.TikaConfig#serviceLoaderFromDomElement
> only considers "warn" and "throw" as possible values.
>
>
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)