Given that we don't ship any code or binaries that involve that Python
library, do we need to care about its license? I'm skeptical of hand-rolled
regexes and would rather favour either of the libraries Jan mentioned. Just
my two cents.
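
If we do end up parsing the manifest ourselves, the continuation rule from
the spec quoted further down (a line starting with a single space continues
the previous one) can be applied without any regex. A minimal sketch in
Python, assuming the manifest text has already been read out of the jar; the
function names are made up for illustration and are not taken from
smokeTestRelease.py:

```python
def unfold_manifest(text):
    """Join continuation lines (leading single space) onto the previous line."""
    lines = []
    for raw in text.splitlines():
        if raw.startswith(" ") and lines:
            # Drop the single leading space and append to the previous line.
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def main_attributes(text):
    """Return the main-section attributes of a manifest as a dict."""
    attrs = {}
    for line in unfold_manifest(text):
        if not line:
            break  # a blank line ends the main section
        name, sep, value = line.partition(": ")
        if sep:
            attrs[name] = value
    return attrs
```

With the 8.10.0 core manifest from Tim's mail below,
main_attributes(text)["Implementation-Version"] comes back as one string
containing the full, unwrapped hash, so a plain substring check on that value
would be enough.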

On Sat, 18 Sep, 2021, 12:02 am Mike Drob, <md...@mdrob.com> wrote:

> The second library you linked, Jan, is AGPL. Thank you for continuing to
> look for alternatives.
>
> I have some regular expressions cooked up locally that I think will let us
> read the split lines going forward, and will put up the patch shortly.
>
> On Fri, Sep 17, 2021 at 7:45 AM Yuval Paz <yuval.p...@mail.huji.ac.il>
> wrote:
>
>> Not sure if this is something that can be changed easily, but if the problem
>> is that some parsers don't know how to handle a line wrap in the middle of
>> the hash, why not move the hash entirely onto the new line (the
>> specification allows a line break at any point in the value)?
>>
>> The commit hash + date comes out to exactly 71 bytes (including the
>> space at the start), and it should be a constant size, and by the time the
>> version reaches 48 bytes we will all probably be dead.
>>
>> On Fri, Sep 17, 2021, 2:47 PM Robert Muir <rcm...@gmail.com> wrote:
>>
>>> Sure, but that package is archived/read-only, GPLv3, with 3 watchers and
>>> 1 star.
>>>
>>> On Fri, Sep 17, 2021 at 4:27 AM Jan Høydahl <jan....@cominvent.com>
>>> wrote:
>>> >
>>> > Let's just follow the spec and move on.
>>> >
>>> > Just tested this Python package, which has no problem parsing the
>>> > problematic manifest: https://pypi.org/project/jarmanifest/
>>> >
>>> > >>> manifest.getAttributes("/tmp/lucene-manifest.mf")
>>> > [{'implementationversion': '9.0.0-SNAPSHOT de45b68c909815ce5ea7b6b9e1a2ce3375b6334d [snapshot build, details omitted]'}]
>>> >
>>> > Jan
>>> >
>>> > 17. sep. 2021 kl. 09:32 skrev Dawid Weiss <dawid.we...@gmail.com>:
>>> >
>>> >
>>> > We could do a few things to keep everyone happy -
>>> >
>>> > 1) keep an abbreviated hash in the Implementation-Version and use a separate
>>> > manifest entry to store the full hash.
>>> > 2) use a longer version for git show (abbrev=num) so that the chance
>>> > of collisions in the future is minimized. It's still not a full hash but a
>>> > long(er) forced prefix.
>>> >
>>> > D.
>>> >
>>> > On Fri, Sep 17, 2021 at 12:21 AM Chris Hostetter <
>>> hossman_luc...@fucit.org> wrote:
>>> >>
>>> >>
>>> >> : I was referring to doing this with languages other than java.
>>> >> :
>>> >> : I'm also assuming that exceeding this limit is going to cause
>>> indirect
>>> >> : hassles for users of lucene, e.g. breaking various security / supply
>>> >> : chain tools. We know a lot of these are total crap but people in the
>>> >> : corporate world have to suffer under them.
>>> >>
>>> >> Just to be clear -- our 'Implementation-Version:' has been exceeding
>>> the
>>> >> 72 byte "single line" limit for a LOOOOONG time -- worrying about how
>>> >> "security / supply chain" tools will handle parsing that line now
>>> seems
>>> >> kind of silly...
>>> >>
>>> >> If tools can't handle a line wrap in the 8.10 jars, then they haven't
>>> >> been able to handle the line wrap since we switched from svn to git (when
>>> >> the Implementation Version values switched from being based on the svn
>>> >> version# to the git sha).
>>> >>
>>> >> The *ONLY* thing that's new here is where in the value the line wrap
>>> >> happens (with 8.10.0 it happens in the middle of the SHA) and that our
>>> >> smoketest tool isn't smart enough to parse the values properly.
>>> >>
>>> >> This is not even the first time we've had a conversation about the
>>> >> smoke tester and Implementation Version line wraps: LUCENE-7023.
>>> >>
>>> >>
>>> >> : It's super-easy to use a short hash here and avoid problems.
>>> >>
>>> >>
>>> >> There is however an actual and practical downside to switching our
>>> >> implementation version to using a "short" SHA, and that's that we would
>>> >> lose the ability to guarantee that the information in the
>>> >> Implementation-Version uniquely identifies what commit a given jar was
>>> >> built from.  Multiple commits with the same short(ened) hash are possible
>>> >> -- multiple commits with identical (full) hashes are not.
>>> >>
>>> >> Folks may think that using git tags is useful enough for figuring this
>>> >> out from official releases, but being able to look at the jar metadata
>>> >> from arbitrary builds off of arbitrary branches and sanity check where
>>> >> exactly they come from has been very useful to me for 10+ years.
>>> >>
>>> >>
>>> >> : On Thu, Sep 16, 2021 at 3:03 AM Dawid Weiss <dawid.we...@gmail.com>
>>> wrote:
>>> >> : >
>>> >> : > Jar command doesn't have it, true. But it's fairly trivial to do,
>>> even
>>> >> : > with an inline snippet like this one?
>>> >> : >
>>> >> : > import java.io.IOException;
>>> >> : > import java.util.jar.JarFile;
>>> >> : >
>>> >> : > public class PrintManifest {
>>> >> : >   public static void main(String[] jars) throws IOException {
>>> >> : >     for (var jar : jars) {
>>> >> : >       // The JDK manifest reader joins wrapped lines for us.
>>> >> : >       try (var jarFile = new JarFile(jar)) {
>>> >> : >         var attrs = jarFile.getManifest().getMainAttributes();
>>> >> : >         System.out.println(jar + ": " + attrs.getValue("Implementation-Version"));
>>> >> : >       }
>>> >> : >     }
>>> >> : >   }
>>> >> : > }
>>> >> : >
>>> >> : > I have this in my lucene-core-9.0.0-SNAPSHOT.jar:
>>> >> : >
>>> >> : > Implementation-Version: 9.0.0-SNAPSHOT de45b68c909815ce5ea7b6b9e1a2ce337
>>> >> : >  5b6334d [snapshot build, details omitted]
>>> >> : >
>>> >> : > and running:
>>> >> : >
>>> >> : > java PrintManifest.java lucene-core-9.0.0-SNAPSHOT.jar
>>> >> : >
>>> >> : > shows:
>>> >> : >
>>> >> : > lucene-core-9.0.0-SNAPSHOT.jar: 9.0.0-SNAPSHOT
>>> >> : > de45b68c909815ce5ea7b6b9e1a2ce3375b6334d [snapshot build, details
>>> >> : > omitted]
>>> >> : >
>>> >> : > This seems easier to me than trying to remember and keep the
>>> length of
>>> >> : > that line shorter than an arbitrary limit.
>>> >> : >
>>> >> : > Dawid
>>> >> : >
>>> >> : >
>>> >> : > On Wed, Sep 15, 2021 at 9:46 PM Robert Muir <rcm...@gmail.com>
>>> wrote:
>>> >> : > >
>>> >> : > > But it's irrelevant that it is "valid" when virtually no tools
>>> >> : > > match it.
>>> >> : > >
>>> >> : > > In other words, I'd agree with you if the "jar" command had some
>>> >> : > > ability to read these manifests and print stuff to stdout, e.g.
>>> if
>>> >> : > > there was ANY interop at all here.
>>> >> : > >
>>> >> : > > But there isn't. So IMO it makes no sense to cause confusion
>>> and chaos
>>> >> : > > by adding an unnecessarily long git commit hash.
>>> >> : > >
>>> >> : > > On Wed, Sep 15, 2021 at 3:26 PM Dawid Weiss <
>>> dawid.we...@gmail.com> wrote:
>>> >> : > > >
>>> >> : > > >
>>> >> : > > > This is valid manifest line-breaking though... Can we read
>>> the manifest properly on the smoke tester side somehow (for example, run a
>>> Java process that reads and extracts the required attribute)? This way we
>>> wouldn't care about the implementation details of how manifest wraps the
>>> lines (or escapes characters).
>>> >> : > > >
>>> >> : > > > D.
>>> >> : > > >
>>> >> : > > > On Wed, Sep 15, 2021 at 8:46 PM Mike Drob <md...@mdrob.com>
>>> wrote:
>>> >> : > > >>
>>> >> : > > >> The benchmark jar has the info we need… sort of. When I
>>> built it, it has:
>>> >> : > > >>
>>> >> : > > >> Implementation-Version: 8.10.0 75a5061d3715cc5d93c4cbe4f1fa62bf035eea1
>>> >> : > > >>  1 - mdrob - 2021-09-15 11:40:36
>>> >> : > > >>
>>> >> : > > >>
>>> >> : > > >> and it’s looking for Implementation-Version: 8.10.0 75a5061d3715cc5d93c4cbe4f1fa62bf035eea11 on one line.
>>> >> : > > >>
>>> >> : > > >> Because 8.10 is a character longer than 8.9, we happen to
>>> wrap the last character of the git commit sha. From the manifest spec:
>>> >> : > > >>
>>> >> : > > >> No line may be longer than 72 bytes (not characters), in its UTF8-encoded form.
>>> >> : > > >> If a value would make the initial line longer than this, it should be continued
>>> >> : > > >> on extra lines (each starting with a single SPACE).
>>> >> : > > >>
>>> >> : > > >> And we were already teetering on the edge of that limit.
>>> We'll run into this problem again in a few years when we try to release
>>> version 10.0.0, so solving it now has practical benefits down the line.
>>> >> : > > >>
>>> >> : > > >> There's a few options that I can come up with -
>>> >> : > > >> 1. Use the short-hash when we generate the jar
>>> >> : > > >> 2. Use the short-hash when we check the contents in the
>>> smoke test
>>> >> : > > >> 3. Do some line join magic in the smoke test.
>>> >> : > > >>
>>> >> : > > >> I'm leaning towards number 1 as I feel that would still be
>>> unique enough for our needs, but would like to hear from others as well.
>>> >> : > > >>
>>> >> : > > >> On Wed, Sep 15, 2021 at 9:46 AM Timothy potter <
>>> thelabd...@gmail.com> wrote:
>>> >> : > > >>>
>>> >> : > > >>> can someone also please look into that benchmark jar issue?
>>> >> : > > >>>
>>> >> : > > >>> Sent from my iPhone
>>> >> : > > >>>
>>> >> : > > >>> On Sep 15, 2021, at 9:44 AM, Nhat Nguyen <
>>> nhat.ngu...@elastic.co.invalid> wrote:
>>> >> : > > >>>
>>> >> : > > >>> Thanks Mayya and Mike! I will backport it to the 8.10
>>> branch.
>>> >> : > > >>>
>>> >> : > > >>> On Wed, Sep 15, 2021 at 10:12 AM Mike Drob <md...@mdrob.com>
>>> wrote:
>>> >> : > > >>>>
>>> >> : > > >>>> I think since Tim is out on vacation, it's probably not
>>> too late. That looks like a good fix to have, do we know how long the bug
>>> has been present?
>>> >> : > > >>>>
>>> >> : > > >>>> On Wed, Sep 15, 2021 at 7:56 AM Mayya Sharipova <
>>> mayya.sharip...@elastic.co.invalid> wrote:
>>> >> : > > >>>>>
>>> >> : > > >>>>> Hello everyone,
>>> >> : > > >>>>> We have discovered and fixed a bug in the Lucene sort optimization (LUCENE-10106) and would like to merge it to Lucene 8.10 if it is not too late.
>>> >> : > > >>>>> I apologize for the inconvenience; the bug was discovered just yesterday.
>>> >> : > > >>>>>
>>> >> : > > >>>>> On Tue, Sep 14, 2021 at 9:26 PM Timothy Potter <
>>> thelabd...@apache.org> wrote:
>>> >> : > > >>>>>>
>>> >> : > > >>>>>> Ahem ... unfortunately there will not be an 8.10 RC this
>>> week. I'm
>>> >> : > > >>>>>> headed out on vacation tomorrow, back at keys on Monday,
>>> Sept 20
>>> >> : > > >>>>>> unless someone else wants to pick up the RM duties
>>> before then?
>>> >> : > > >>>>>>
>>> >> : > > >>>>>> After failing the test suite at various places and other
>>> weirdness
>>> >> : > > >>>>>> like .asc files not getting created, I finally got to
>>> the smoke test
>>> >> : > > >>>>>> part, which is now failing with:
>>> >> : > > >>>>>>
>>> >> : > > >>>>>>   File "/Users/tjp/.lucene-releases/8.10.0/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 176, in checkJARMetaData
>>> >> : > > >>>>>>     raise RuntimeError('%s is missing "%s" inside its META-INF/MANIFEST.MF (wrong git revision?)' % \
>>> >> : > > >>>>>> RuntimeError: JAR file "/Users/tjp/.lucene-releases/8.10.0/RC1/smoketest/unpack/lucene-8.10.0/benchmark/lucene-benchmark-8.10.0.jar" is missing "Implementation-Version: 8.10.0 ecf5c747e6df418dd05a18af327c20051f0584d7" inside its META-INF/MANIFEST.MF (wrong git revision?)
>>> >> : > > >>>>>>
>>> >> : > > >>>>>> FWIW, I verified that the other Lucene JAR files have
>>> this line in
>>> >> : > > >>>>>> them, such as core:
>>> >> : > > >>>>>>
>>> >> : > > >>>>>> Manifest-Version: 1.0
>>> >> : > > >>>>>> Ant-Version: Apache Ant 1.9.15
>>> >> : > > >>>>>> Created-By: 1.8.0_265-b01 (AppleJDK-8.0.265.1.1)
>>> >> : > > >>>>>> Extension-Name: org.apache.lucene
>>> >> : > > >>>>>> Specification-Title: Lucene Search Engine: core
>>> >> : > > >>>>>> Specification-Version: 8.10.0
>>> >> : > > >>>>>> Specification-Vendor: The Apache Software Foundation
>>> >> : > > >>>>>> Implementation-Title: org.apache.lucene
>>> >> : > > >>>>>> Implementation-Version: 8.10.0 ecf5c747e6df418dd05a18af327c20051f0584d
>>> >> : > > >>>>>>  7 - tjp - 2021-09-14 19:08:42
>>> >> : > > >>>>>> Implementation-Vendor: The Apache Software Foundation
>>> >> : > > >>>>>> X-Compile-Source-JDK: 8
>>> >> : > > >>>>>> X-Compile-Target-JDK: 8
>>> >> : > > >>>>>> Multi-Release: true
>>> >> : > > >>>>>>
>>> >> : > > >>>>>> On Tue, Sep 14, 2021 at 1:21 PM Ishan Chattopadhyaya
>>> >> : > > >>>>>> <ichattopadhy...@gmail.com> wrote:
>>> >> : > > >>>>>> >
>>> >> : > > >>>>>> > All the best, this is the worst step.
>>> >> : > > >>>>>> >
>>> >> : > > >>>>>> > On Tue, 14 Sep, 2021, 10:47 pm Timothy Potter, <
>>> thelabd...@gmail.com> wrote:
>>> >> : > > >>>>>> >>
>>> >> : > > >>>>>> >> Building RC1 now ... stay tuned.
>>> >> : > > >>>>>> >>
>>> >> : > > >>>>>> >> On Thu, Sep 9, 2021 at 2:30 PM Timothy Potter <
>>> thelabd...@gmail.com> wrote:
>>> >> : > > >>>>>> >> >
>>> >> : > > >>>>>> >> > Thanks for the update Mike!
>>> >> : > > >>>>>> >> >
>>> >> : > > >>>>>> >> > I'm backporting SOLR-15620 right now and am cooking
>>> up a quick PR for
>>> >> : > > >>>>>> >> > SOLR-15621, which looks like an easy win for the
>>> issue Cassandra
>>> >> : > > >>>>>> >> > reported on Slack earlier today.
>>> >> : > > >>>>>> >> >
>>> >> : > > >>>>>> >> > Cheers,
>>> >> : > > >>>>>> >> > Tim
>>> >> : > > >>>>>> >> >
>>> >> : > > >>>>>> >> > On Thu, Sep 9, 2021 at 11:32 AM Mike Drob <
>>> md...@apache.org> wrote:
>>> >> : > > >>>>>> >> > >
>>> >> : > > >>>>>> >> > > Hi Tim, I'm still working on SOLR-15555, the code
>>> and benchmarking
>>> >> : > > >>>>>> >> > > both look pretty good, but I've got a few last
>>> unit tests that I need
>>> >> : > > >>>>>> >> > > to chase down. Hopefully taken care of by today
>>> or tomorrow, I'll be
>>> >> : > > >>>>>> >> > > sure to keep you updated though.
>>> >> : > > >>>>>> >> > >
>>> >> : > > >>>>>> >> > >
>>> >> : > > >>>>>> >> > > On Thu, Sep 9, 2021 at 11:39 AM Timothy Potter <
>>> thelabd...@gmail.com> wrote:
>>> >> : > > >>>>>> >> > > >
>>> >> : > > >>>>>> >> > > > I found
>>> https://issues.apache.org/jira/browse/SOLR-15620 while testing
>>> >> : > > >>>>>> >> > > > the schema designer. I haven't built the RC
>>> yet, so going to see if I
>>> >> : > > >>>>>> >> > > > can get this in today.
>>> >> : > > >>>>>> >> > > >
>>> >> : > > >>>>>> >> > > > On Tue, Sep 7, 2021 at 12:36 PM Timothy Potter <
>>> thelabd...@apache.org> wrote:
>>> >> : > > >>>>>> >> > > > >
>>> >> : > > >>>>>> >> > > > > NOTICE:
>>> >> : > > >>>>>> >> > > > >
>>> >> : > > >>>>>> >> > > > > Branch branch_8_10 has been cut and versions
>>> updated to 8.11 on stable branch.
>>> >> : > > >>>>>> >> > > > >
>>> >> : > > >>>>>> >> > > > > Please observe the normal rules:
>>> >> : > > >>>>>> >> > > > >
>>> >> : > > >>>>>> >> > > > > * No new features may be committed to the
>>> branch.
>>> >> : > > >>>>>> >> > > > >
>>> >> : > > >>>>>> >> > > > > * Documentation patches, build patches and
>>> serious bug fixes may be
>>> >> : > > >>>>>> >> > > > >   committed to the branch. However, you
>>> should submit all patches you
>>> >> : > > >>>>>> >> > > > >   want to commit to Jira first to give others
>>> the chance to review
>>> >> : > > >>>>>> >> > > > >   and possibly vote against the patch. Keep
>>> in mind that it is our
>>> >> : > > >>>>>> >> > > > >   main intention to keep the branch as stable
>>> as possible.
>>> >> : > > >>>>>> >> > > > >
>>> >> : > > >>>>>> >> > > > > * All patches that are intended for the
>>> branch should first be committed
>>> >> : > > >>>>>> >> > > > >   to the unstable branch, merged into the
>>> stable branch, and then into
>>> >> : > > >>>>>> >> > > > >   the current release branch.
>>> >> : > > >>>>>> >> > > > >
>>> >> : > > >>>>>> >> > > > > * Normal unstable and stable branch
>>> development may continue as usual.
>>> >> : > > >>>>>> >> > > > >   However, if you plan to commit a big change
>>> to the unstable branch
>>> >> : > > >>>>>> >> > > > >   while the branch feature freeze is in
>>> effect, think twice: can't the
>>> >> : > > >>>>>> >> > > > >   addition wait a couple more days? Merges of
>>> bug fixes into the branch
>>> >> : > > >>>>>> >> > > > >   may become more difficult.
>>> >> : > > >>>>>> >> > > > >
>>> >> : > > >>>>>> >> > > > > * Only Jira issues with Fix version 8.10 and
>>> priority "Blocker" will delay
>>> >> : > > >>>>>> >> > > > >   a release candidate build.
>>> >> : > > >>>>>> >> > > > > ----
>>> >> : > > >>>>>> >> > > >
>>> >> : > > >>>>>> >> > > >
>>> ---------------------------------------------------------------------
>>> >> : > > >>>>>> >> > > > To unsubscribe, e-mail:
>>> dev-unsubscr...@lucene.apache.org
>>> >> : > > >>>>>> >> > > > For additional commands, e-mail:
>>> dev-h...@lucene.apache.org
>>> >> : > > >>>>>> >> > > >
>>> >> : > > >>>>>> >> > >
>>> >> : > > >>>>>> >> > >
>>> ---------------------------------------------------------------------
>>> >> : > > >>>>>> >> > > To unsubscribe, e-mail:
>>> dev-unsubscr...@lucene.apache.org
>>> >> : > > >>>>>> >> > > For additional commands, e-mail:
>>> dev-h...@lucene.apache.org
>>> >> : > > >>>>>> >> > >
>>> >> : > > >>>>>> >>
>>> >> : > > >>>>>> >>
>>> ---------------------------------------------------------------------
>>> >> : > > >>>>>> >> To unsubscribe, e-mail:
>>> dev-unsubscr...@solr.apache.org
>>> >> : > > >>>>>> >> For additional commands, e-mail:
>>> dev-h...@solr.apache.org
>>> >> : > > >>>>>> >>
>>> >> : > > >>>>>>
>>> >> : > > >>>>>>
>>> ---------------------------------------------------------------------
>>> >> : > > >>>>>> To unsubscribe, e-mail:
>>> dev-unsubscr...@lucene.apache.org
>>> >> : > > >>>>>> For additional commands, e-mail:
>>> dev-h...@lucene.apache.org
>>> >> : > > >>>>>>
>>> >> : > >
>>> >> : > >
>>> ---------------------------------------------------------------------
>>> >> : > > To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>>> >> : > > For additional commands, e-mail: dev-h...@solr.apache.org
>>> >> : > >
>>> >> : >
>>> >> : >
>>> ---------------------------------------------------------------------
>>> >> : > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> >> : > For additional commands, e-mail: dev-h...@lucene.apache.org
>>> >> : >
>>> >> :
>>> >> :
>>> ---------------------------------------------------------------------
>>> >> : To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> >> : For additional commands, e-mail: dev-h...@lucene.apache.org
>>> >> :
>>> >> :
>>> >>
>>> >> -Hoss
>>> >> http://www.lucidworks.com/
>>> >>
>>> >
>>> >
>>>
>>>
>>>
