Re: high level parser module names in 2.x

2021-05-11 Thread Eric Pugh
Sounds good to me.

On Tue, May 11, 2021 at 9:33 AM Tim Allison wrote:
> If there aren't objections, I'll make this change today or tomorrow.
>
> Cheers,
>
> Tim
>
> On Tue, Apr 20, 2021 at 10:57 AM Tim Allison wrote:
> >
> > How about:
> >
> > standard
> > extended
> > ml (for machin

Re: high level parser module names in 2.x

2021-05-11 Thread Tim Allison
If there aren't objections, I'll make this change today or tomorrow.

Cheers,

Tim

On Tue, Apr 20, 2021 at 10:57 AM Tim Allison wrote:
>
> How about:
>
> standard
> extended
> ml (for machine learning)
>
> On Wed, Mar 10, 2021 at 10:37 AM Nick Burch wrote:
> >
> > On Tue, 9 Mar 2021,

Re: high level parser module names in 2.x

2021-04-20 Thread Tim Allison
How about:

standard
extended
ml (for machine learning)

On Wed, Mar 10, 2021 at 10:37 AM Nick Burch wrote:
>
> On Tue, 9 Mar 2021, Tim Allison wrote:
> > Would this be better?
> >
> > tika-parsers-basic
> > tika-parsers-complex
> > tika-parsers-¯\_(ツ)_/¯
>
> GStreamer has 4 levels of plugins, Bas

Re: high level parser module names in 2.x

2021-03-10 Thread Nick Burch
On Tue, 9 Mar 2021, Tim Allison wrote:
> Would this be better?
>
> tika-parsers-basic
> tika-parsers-complex
> tika-parsers-¯\_(ツ)_/¯

GStreamer has 4 levels of plugins, Base, Good, Ugly and Bad. Descriptions of what qualifies for what at https://gstreamer.freedesktop.org/modules/ . I can see developer

Re: high level parser module names in 2.x

2021-03-09 Thread Tim Allison
Eric,

Would this be better?

tika-parsers-basic
tika-parsers-complex
tika-parsers-¯\_(ツ)_/¯

On Tue, Mar 9, 2021 at 12:25 PM Eric Pugh wrote:
>
> I’d like to see the discriminators on the parsers be more about the type of
> parser, and what it’s going to drag along/impact my system with, and the

Re: high level parser module names in 2.x

2021-03-09 Thread Eric Pugh
I’d like to see the discriminators on the parsers be more about the type of parser, and what it’s going to drag along/impact my system with, and these names reflect more the history of Tika’s evolution. Starting with the descriptive paragraphs, here is some brainstorming of names: with the exce

Re: high level parser module names in 2.x

2021-03-09 Thread Ken Krugler
Oooh, this sounds like a great opportunity for some bike-shedding :)

If I had my druthers, I would organize as:

tika-parsers-standard: whatever is required to extract text and metadata from 80%+ of stand-alone documents found on the web.
tika-parsers-archives: zip, pkg, etc.
tika-parsers-ocr: wh

high level parser module names in 2.x

2021-03-09 Thread Tim Allison
All,

I was recently chatting about Tika 2.x with some Tika friends and they had some hesitation about the names for the three high level parser modules. They are currently:

tika-parsers-classic
tika-parsers-extended
tika-parsers-advanced

The quibbles weren't with the delineation, but with the