Re: Lucene 9.5.0 release
+1 from me.

On Tue, Jan 17, 2023 at 11:32 AM Uwe Schindler wrote:
> +1
>
> On 13.01.2023 at 10:54, Luca Cavanna wrote:
> > Hi all,
> > I'd like to propose that we release Lucene 9.5.0. There is a decent
> > amount of changes that would go into it looking at the github
> > milestone: https://github.com/apache/lucene/milestone/4 . I'd
> > volunteer to be the release manager. There is one PR open listed for
> > the 9.5 milestone: https://github.com/apache/lucene/pull/11873 . Is
> > this something that we do want to address before we release? Is
> > anybody aware of outstanding work that we would like to include or
> > known blocker issues that are not listed in the 9.5 milestone?
> >
> > Cheers
> > Luca
>
> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [lucene] branch main updated: More refactoring work, and fix a distance calculation.
Thanks, I was unable to get to this until this morning. The code was dead because the corresponding call hadn't been included. Fixed now. Karl On Thu, Nov 24, 2022 at 5:50 AM Adrien Grand wrote: > Karl, this commit has been failing precommit because it introduced > dead code. I just pushed a fix. > > > On Thu, Nov 24, 2022 at 10:47 AM wrote: > > > > This is an automated email from the ASF dual-hosted git repository. > > > > kwright pushed a commit to branch main > > in repository https://gitbox.apache.org/repos/asf/lucene.git > > > > > > The following commit(s) were added to refs/heads/main by this push: > > new 839dfb5a2dc More refactoring work, and fix a distance > calculation. > > 839dfb5a2dc is described below > > > > commit 839dfb5a2dc46c4b2d16d9db5ea9f31ca1e8d907 > > Author: Karl David Wright > > AuthorDate: Wed Nov 23 23:36:15 2022 -0500 > > > > More refactoring work, and fix a distance calculation. > > --- > > .../lucene/spatial3d/geom/GeoDegeneratePath.java | 32 ++--- > > .../lucene/spatial3d/geom/GeoStandardPath.java | 54 > -- > > .../apache/lucene/spatial3d/geom/TestGeoPath.java | 12 +++-- > > 3 files changed, 62 insertions(+), 36 deletions(-) > > > > diff --git > a/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoDegeneratePath.java > b/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoDegeneratePath.java > > index 524451ac68a..d1a452ca566 100644 > > --- > a/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoDegeneratePath.java > > +++ > b/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoDegeneratePath.java > > @@ -282,7 +282,7 @@ class GeoDegeneratePath extends GeoBasePath { > > minDistance = newDistance; > >} > > } > > -return minDistance; > > +return distanceStyle.fromAggregationForm(minDistance); > >} > > > >@Override > > @@ -468,6 +468,15 @@ class GeoDegeneratePath extends GeoBasePath { > >return this.point.isIdentical(x, y, z); > > } > > > > +public boolean isWithinSection(final double x, final 
double y, > final double z) { > > + for (final Membership cutoffPlane : cutoffPlanes) { > > +if (!cutoffPlane.isWithin(x, y, z)) { > > + return false; > > +} > > + } > > + return true; > > +} > > + > > /** > > * Compute interior path distance. > > * > > @@ -502,7 +511,7 @@ class GeoDegeneratePath extends GeoBasePath { > >return Double.POSITIVE_INFINITY; > > } > >} > > - return distanceStyle.computeDistance(this.point, x, y, z); > > + return > distanceStyle.toAggregationForm(distanceStyle.computeDistance(this.point, > x, y, z)); > > } > > > > /** > > @@ -516,7 +525,7 @@ class GeoDegeneratePath extends GeoBasePath { > > */ > > public double outsideDistance( > > final DistanceStyle distanceStyle, final double x, final double > y, final double z) { > > - return distanceStyle.computeDistance(this.point, x, y, z); > > + return > distanceStyle.toAggregationForm(distanceStyle.computeDistance(this.point, > x, y, z)); > > } > > > > /** > > @@ -578,7 +587,7 @@ class GeoDegeneratePath extends GeoBasePath { > > > > @Override > > public String toString() { > > - return point.toString(); > > + return "SegmentEndpoint: " + point; > > } > >} > > > > @@ -659,6 +668,10 @@ class GeoDegeneratePath extends GeoBasePath { > >&& normalizedConnectingPlane.evaluateIsZero(x, y, z); > > } > > > > +public boolean isWithinSection(final double x, final double y, > final double z) { > > + return startCutoffPlane.isWithin(x, y, z) && > endCutoffPlane.isWithin(x, y, z); > > +} > > + > > /** > > * Compute path center distance (distance from path to current > point). > > * > > @@ -671,7 +684,7 @@ class GeoDegeneratePath extends GeoBasePath { > > public double pathCenterDistance( > > final DistanceStyle distanceStyle, final double x, final double > y, final double z) { > >// First, if this point is outside the endplanes of the segment, > return POSITIVE_INFINITY. 
> > - if (!startCutoffPlane.isWithin(x, y, z) || > !endCutoffPlane.isWithin(x, y, z)) { > > + if (!isWithinSection(x, y, z)) { > > return Double.POSITIVE_INFINITY; > >} > >// (1) Compute normalizedPerpPlane. If degenerate, then there is > no such plane, which means > > @@ -710,7 +723,7 @@ class GeoDegeneratePath extends GeoBasePath { > >"Can't find world intersection for point x=" + x + " y=" > + y + " z=" + z); > > } > >} > > - return distanceStyle.computeDistance(thePoint, x, y, z); > > + return > distanceStyle.toAggregationForm(distanceStyle.computeDistance(thePoint, x, > y, z)); > > } > > > > /** > > @@ -726,7 +739,7 @@ class GeoDegeneratePath extends GeoBasePath { > > public
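The hunks quoted above wrap each per-segment distance in `distanceStyle.toAggregationForm(...)` and convert the running minimum back once with `distanceStyle.fromAggregationForm(...)`. A minimal sketch of that pattern follows, assuming a squared-distance style whose aggregation form is the square root. The `Style` interface, `SquaredStyle`, and `minDistance` are hypothetical stand-ins written for illustration only; they are not the spatial3d classes.

```java
class AggregationFormSketch {
  /** Hypothetical stand-in for a distance style; not the real DistanceStyle interface. */
  interface Style {
    double compute(double dx, double dy);

    double toAggregationForm(double distance);

    double fromAggregationForm(double aggregate);
  }

  /** Assumed example style: distances are reported squared, but compared in linear form. */
  static final class SquaredStyle implements Style {
    @Override
    public double compute(double dx, double dy) {
      return dx * dx + dy * dy;
    }

    @Override
    public double toAggregationForm(double distance) {
      return Math.sqrt(distance);
    }

    @Override
    public double fromAggregationForm(double aggregate) {
      return aggregate * aggregate;
    }
  }

  /** Tracks the minimum in aggregation form, then converts back exactly once on return. */
  static double minDistance(Style style, double[][] offsets) {
    double min = Double.POSITIVE_INFINITY;
    for (double[] offset : offsets) {
      double candidate = style.toAggregationForm(style.compute(offset[0], offset[1]));
      if (candidate < min) {
        min = candidate;
      }
    }
    return style.fromAggregationForm(min);
  }

  public static void main(String[] args) {
    double[][] offsets = {{3.0, 4.0}, {6.0, 8.0}};
    System.out.println(minDistance(new SquaredStyle(), offsets));
  }
}
```

With no candidates at all the sketch returns positive infinity, which matches the convention the quoted code uses for points outside a segment's cutoff planes.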
Re: [lucene] branch main updated: Prevent NPEs while still handling the polar case for horizontal planes right off the pole
My entire tool set and work environment is inside WSL. I've determined that the issue for me is the performance of the file system. I had to remove the (bundled) antivirus software to get even where I am now. But I have no evidence that even doing windows-native operations with this disk is fast. I suspect that even though this is an SSD it's not a very fast one. It did get twice as fast when I turned off the new Windows 11 "climate change" feature, which apparently conserves energy by throttling the hell out of everything, including disk access. So maybe this is still being throttled to some degree and I have to figure out where.

Karl

On Thu, Nov 24, 2022 at 3:23 AM Jan Høydahl wrote:
> I’m not on Windows myself, but I think the trick is doing the git clone to
> the WSL file system. So you may have one checkout for use with windows and
> another for use within wsl.
>
> And if you’re a CLI person, there’s a GitHub cli tool ‘hub’ that is handy:
> https://hub.github.com/
>
> Jan Høydahl
>
> On 17 Nov 2022 at 16:49, Dawid Weiss wrote:
>> I never used WSL but it does seem like the problem there:
>>
>> "As you can tell from the comparison table above, the WSL 2
>> architecture outperforms WSL 1 in several ways, with the exception of
>> performance across OS file systems, which can be addressed by storing
>> your project files on the same operating system as the tools you are
>> running to work on the project."
>>
>> https://learn.microsoft.com/en-us/windows/wsl/compare-versions
>>
>> Dawid
>>
>> On Thu, Nov 17, 2022 at 1:11 PM Robert Muir wrote:
>>> if your machine is really 12 cores and 64GB ram but is that slow, then
>>> uninstall that windows shit immediately, that's horrible.
>>>
>>> On Thu, Nov 17, 2022 at 5:46 AM Karl Wright wrote:
>>>> Thanks - the target I was using was the complete "build" target on the
>>>> whole project. This will be a valuable improvement. ;-)
>>>>
>>>> I have slow network here so it is possible that the entire build was slow
>>>> for that reason. The machine is a new Dell laptop, 12 cores, 64GB memory,
>>>> but I am running under Windows Subsystem for Linux which is a bit slower
>>>> than native Ubuntu. Still, the gradlew command you gave takes many minutes
>>>> (of which a sizable amount is spent in :gitStatus - more than 5 minutes
>>>> there alone). Anything less than 10 minutes I deem acceptable, which this
>>>> doesn't quite manage, but I'll live.
>>>>
>>>> Karl
>>>>
>>>> On Thu, Nov 17, 2022 at 5:06 AM Dawid Weiss wrote:
>>>>>> Thank you for the comment.
>>>>>
>>>>> Sorry if it came out the wrong way - I certainly didn't mean it to be
>>>>> unkind.
>>>>>
>>>>>> It took me several days just to get things set up so I was able to commit
>>>>>> again, and I did this through command-line not github.
>>>>>
>>>>> These things are not mutually exclusive - I work with command line as
>>>>> well. You just push to your own repository (or a branch, if you don't care
>>>>> to have your own fork on github) and then file a PR from there. If you're
>>>>> on a slower machine - this is even better since precommit checks run for
>>>>> you there.
>>>>>
>>>>>> The full gradlew script takes over 2 hours to run now so if there's a
>>>>>> faster target I can use to determine these things in advance I'd love to
>>>>>> know what it is.
>>>>>
>>>>> Well, this is crazy long so I wonder what's happening. I'd love to help
>>>>> but it'd be good to know what machine this is (disk, cpu, memory?) and what
>>>>> the build command was. Without knowing these, I'd say - run the tests and
>>>>> checks for the module you've changed only, not for everything. How long
>>>>> does this take?
>>>>>
>>>>> ./gradlew check -p lucene/spatial3d
>>>>>
>>>>> It takes roughly 1 minute for me, including startup (after the daemon is
>>>>> running in the background, it's much faster).
>>>>>
>>>>> There are some workflow examples/ hints I left here:
>>>>> https://github.com/apache/lucene/blob/main/help/workflow.txt#L6-L22
>>>>>
>>>>> Hope it helps,
>>>>> Dawid
Re: [lucene] 02/03: Fix longstanding bug in path bounds calculation, and hook up efficient isWithin() and distance logic
These are randomized test failures occurring because a getBounds() operation is apparently not always returning the right thing and the new code depends on it being right. I can make a commit that disables this logic if it's annoying but the failures have been there all along. But now the code is more sensitive to them. I already fixed another such issue with getBounds() for GeoPaths but there's apparently more I didn't get before committing. If you need a temporary commit to make the random tests always pass while I diagnose and fix please let me know. Karl On Sun, Nov 20, 2022 at 1:49 AM Karl Wright wrote: > I'm looking at it. > Karl > > > On Sat, Nov 19, 2022 at 11:41 PM Robert Muir wrote: > >> Multiple spatial tests are failing in jenkins... bisected them to this >> commit. >> >> Can you please look into it? >> https://github.com/apache/lucene/issues/11956 >> >> On Sat, Nov 19, 2022 at 8:22 PM wrote: >> > >> > This is an automated email from the ASF dual-hosted git repository. >> > >> > kwright pushed a commit to branch main >> > in repository https://gitbox.apache.org/repos/asf/lucene.git >> > >> > commit 9bca7a70e10db81b39a5afb4498aab1006402031 >> > Author: Karl David Wright >> > AuthorDate: Sat Nov 19 17:35:30 2022 -0500 >> > >> > Fix longstanding bug in path bounds calculation, and hook up >> efficient isWithin() and distance logic >> > --- >> > .../geom/{GeoBaseShape.java => GeoBaseBounds.java} | 6 +- >> > .../apache/lucene/spatial3d/geom/GeoBaseShape.java | 24 +- >> > .../apache/lucene/spatial3d/geom/GeoBounds.java| 27 ++ >> > .../org/apache/lucene/spatial3d/geom/GeoShape.java | 2 +- >> > .../lucene/spatial3d/geom/GeoStandardPath.java | 277 >> - >> > 5 files changed, 140 insertions(+), 196 deletions(-) >> > >> > diff --git >> a/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseShape.java >> b/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseBounds.java >> > similarity index 90% >> > copy from >> 
lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseShape.java >> > copy to >> lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseBounds.java >> > index a5992392563..52030b333d3 100644 >> > --- >> a/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseShape.java >> > +++ >> b/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseBounds.java >> > @@ -17,18 +17,18 @@ >> > package org.apache.lucene.spatial3d.geom; >> > >> > /** >> > - * Base extended shape object. >> > + * Base object that supports bounds operations. >> > * >> > * @lucene.internal >> > */ >> > -public abstract class GeoBaseShape extends BasePlanetObject implements >> GeoShape { >> > +public abstract class GeoBaseBounds extends BasePlanetObject >> implements GeoBounds { >> > >> >/** >> > * Constructor. >> > * >> > * @param planetModel is the planet model to use. >> > */ >> > - public GeoBaseShape(final PlanetModel planetModel) { >> > + public GeoBaseBounds(final PlanetModel planetModel) { >> > super(planetModel); >> >} >> > >> > diff --git >> a/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseShape.java >> b/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseShape.java >> > index a5992392563..a4b5cd18a62 100644 >> > --- >> a/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseShape.java >> > +++ >> b/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseShape.java >> > @@ -21,7 +21,7 @@ package org.apache.lucene.spatial3d.geom; >> > * >> > * @lucene.internal >> > */ >> > -public abstract class GeoBaseShape extends BasePlanetObject implements >> GeoShape { >> > +public abstract class GeoBaseShape extends GeoBaseBounds implements >> GeoShape { >> > >> >/** >> > * Constructor. 
>> > @@ -31,26 +31,4 @@ public abstract class GeoBaseShape extends >> BasePlanetObject implements GeoShape >> >public GeoBaseShape(final PlanetModel planetModel) { >> > super(planetModel); >> >} >> > - >> > - @Override >> > - public void getBounds(Bounds bounds) { >> > -if (isWithin(planetModel.NORTH_POLE)) { >> > -
Re: [lucene] 02/03: Fix longstanding bug in path bounds calculation, and hook up efficient isWithin() and distance logic
I'm looking at it. Karl On Sat, Nov 19, 2022 at 11:41 PM Robert Muir wrote: > Multiple spatial tests are failing in jenkins... bisected them to this > commit. > > Can you please look into it? https://github.com/apache/lucene/issues/11956 > > On Sat, Nov 19, 2022 at 8:22 PM wrote: > > > > This is an automated email from the ASF dual-hosted git repository. > > > > kwright pushed a commit to branch main > > in repository https://gitbox.apache.org/repos/asf/lucene.git > > > > commit 9bca7a70e10db81b39a5afb4498aab1006402031 > > Author: Karl David Wright > > AuthorDate: Sat Nov 19 17:35:30 2022 -0500 > > > > Fix longstanding bug in path bounds calculation, and hook up > efficient isWithin() and distance logic > > --- > > .../geom/{GeoBaseShape.java => GeoBaseBounds.java} | 6 +- > > .../apache/lucene/spatial3d/geom/GeoBaseShape.java | 24 +- > > .../apache/lucene/spatial3d/geom/GeoBounds.java| 27 ++ > > .../org/apache/lucene/spatial3d/geom/GeoShape.java | 2 +- > > .../lucene/spatial3d/geom/GeoStandardPath.java | 277 > - > > 5 files changed, 140 insertions(+), 196 deletions(-) > > > > diff --git > a/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseShape.java > b/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseBounds.java > > similarity index 90% > > copy from > lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseShape.java > > copy to > lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseBounds.java > > index a5992392563..52030b333d3 100644 > > --- > a/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseShape.java > > +++ > b/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseBounds.java > > @@ -17,18 +17,18 @@ > > package org.apache.lucene.spatial3d.geom; > > > > /** > > - * Base extended shape object. > > + * Base object that supports bounds operations. 
> > * > > * @lucene.internal > > */ > > -public abstract class GeoBaseShape extends BasePlanetObject implements > GeoShape { > > +public abstract class GeoBaseBounds extends BasePlanetObject implements > GeoBounds { > > > >/** > > * Constructor. > > * > > * @param planetModel is the planet model to use. > > */ > > - public GeoBaseShape(final PlanetModel planetModel) { > > + public GeoBaseBounds(final PlanetModel planetModel) { > > super(planetModel); > >} > > > > diff --git > a/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseShape.java > b/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseShape.java > > index a5992392563..a4b5cd18a62 100644 > > --- > a/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseShape.java > > +++ > b/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBaseShape.java > > @@ -21,7 +21,7 @@ package org.apache.lucene.spatial3d.geom; > > * > > * @lucene.internal > > */ > > -public abstract class GeoBaseShape extends BasePlanetObject implements > GeoShape { > > +public abstract class GeoBaseShape extends GeoBaseBounds implements > GeoShape { > > > >/** > > * Constructor. 
> > @@ -31,26 +31,4 @@ public abstract class GeoBaseShape extends > BasePlanetObject implements GeoShape > >public GeoBaseShape(final PlanetModel planetModel) { > > super(planetModel); > >} > > - > > - @Override > > - public void getBounds(Bounds bounds) { > > -if (isWithin(planetModel.NORTH_POLE)) { > > - > bounds.noTopLatitudeBound().noLongitudeBound().addPoint(planetModel.NORTH_POLE); > > -} > > -if (isWithin(planetModel.SOUTH_POLE)) { > > - > bounds.noBottomLatitudeBound().noLongitudeBound().addPoint(planetModel.SOUTH_POLE); > > -} > > -if (isWithin(planetModel.MIN_X_POLE)) { > > - bounds.addPoint(planetModel.MIN_X_POLE); > > -} > > -if (isWithin(planetModel.MAX_X_POLE)) { > > - bounds.addPoint(planetModel.MAX_X_POLE); > > -} > > -if (isWithin(planetModel.MIN_Y_POLE)) { > > - bounds.addPoint(planetModel.MIN_Y_POLE); > > -} > > -if (isWithin(planetModel.MAX_Y_POLE)) { > > - bounds.addPoint(planetModel.MAX_Y_POLE); > > -} > > - } > > } > > diff --git > a/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBounds.java > b/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBounds.java > > new file mode 100644 > > index 000..935366c5a08 > > --- /dev/null > > +++ > b/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/GeoBounds.java > > @@ -0,0 +1,27 @@ > > +/* > > + * Licensed to the Apache Software Foundation (ASF) under one or more > > + * contributor license agreements. See the NOTICE file distributed with > > + * this work for additional information regarding copyright ownership. > > + * The ASF licenses this file to You under the Apache License, Version > 2.0 > > + * (the "License"); you may not use this file except in compliance with > > + * the License. You may obtain a copy of the License at > > + * >
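The diff above removes `getBounds()` from `GeoBaseShape` and re-parents that class onto the new `GeoBaseBounds`, so the pole checks appear to live in one base class that both shapes and other bounded objects can share. A minimal sketch of that arrangement follows. It is an illustration under stated assumptions: poles are modelled as plain strings, `Bounds` is a hypothetical accumulator, and the `noTopLatitudeBound()` / `noLongitudeBound()` calls of the real method are left out.

```java
import java.util.ArrayList;
import java.util.List;

class BoundsHierarchySketch {
  /** Hypothetical bounds accumulator; the real Bounds interface has many more methods. */
  static final class Bounds {
    final List<String> points = new ArrayList<>();

    Bounds addPoint(String point) {
      points.add(point);
      return this;
    }
  }

  /** Base class that owns the shared pole check, in the spirit of GeoBaseBounds. */
  abstract static class BaseBounds {
    static final String[] POLES = {
      "NORTH_POLE", "SOUTH_POLE", "MIN_X_POLE", "MAX_X_POLE", "MIN_Y_POLE", "MAX_Y_POLE"
    };

    /** Subclasses decide membership; the base class only iterates the poles. */
    abstract boolean isWithin(String pole);

    void getBounds(Bounds bounds) {
      for (String pole : POLES) {
        if (isWithin(pole)) {
          bounds.addPoint(pole);
        }
      }
    }
  }

  /** Shapes inherit getBounds() instead of redefining it, as GeoBaseShape does after the change. */
  abstract static class BaseShape extends BaseBounds {}

  /** Assumed example shape that contains only the north pole. */
  static final class NorthCap extends BaseShape {
    @Override
    boolean isWithin(String pole) {
      return pole.equals("NORTH_POLE");
    }
  }

  public static void main(String[] args) {
    Bounds bounds = new Bounds();
    new NorthCap().getBounds(bounds);
    System.out.println(bounds.points);
  }
}
```

The point of the split is that a class needing only bounds behaviour can extend the bounds base directly without also taking on the shape contract.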
Re: [lucene] branch main updated: Prevent NPEs while still handling the polar case for horizontal planes right off the pole
Thanks - the target I was using was the complete "build" target on the whole project. This will be a valuable improvement. ;-)

I have slow network here so it is possible that the entire build was slow for that reason. The machine is a new Dell laptop, 12 cores, 64GB memory, but I am running under Windows Subsystem for Linux which is a bit slower than native Ubuntu. Still, the gradlew command you gave takes many minutes (of which a sizable amount is spent in :gitStatus - more than 5 minutes there alone). Anything less than 10 minutes I deem acceptable, which this doesn't quite manage, but I'll live.

Karl

On Thu, Nov 17, 2022 at 5:06 AM Dawid Weiss wrote:
>> Thank you for the comment.
>
> Sorry if it came out the wrong way - I certainly didn't mean it to be
> unkind.
>
>> It took me several days just to get things set up so I was able to commit
>> again, and I did this through command-line not github.
>
> These things are not mutually exclusive - I work with command line as
> well. You just push to your own repository (or a branch, if you don't care
> to have your own fork on github) and then file a PR from there. If you're
> on a slower machine - this is even better since precommit checks run for
> you there.
>
>> The full gradlew script takes over 2 hours to run now so if there's a
>> faster target I can use to determine these things in advance I'd love to
>> know what it is.
>
> Well, this is crazy long so I wonder what's happening. I'd love to help
> but it'd be good to know what machine this is (disk, cpu, memory?) and what
> the build command was. Without knowing these, I'd say - run the tests and
> checks for the module you've changed only, not for everything. How long
> does this take?
>
> ./gradlew check -p lucene/spatial3d
>
> It takes roughly 1 minute for me, including startup (after the daemon is
> running in the background, it's much faster).
>
> There are some workflow examples/ hints I left here:
> https://github.com/apache/lucene/blob/main/help/workflow.txt#L6-L22
>
> Hope it helps,
> Dawid
Re: [lucene] branch main updated: Prevent NPEs while still handling the polar case for horizontal planes right off the pole
Thank you for the comment. It took me several days just to get things set up so I was able to commit again, and I did this through command-line not github. The full gradlew script takes over 2 hours to run now so if there's a faster target I can use to determine these things in advance I'd love to know what it is. Karl On Thu, Nov 17, 2022 at 1:23 AM Dawid Weiss wrote: > > Hi Karl, > > This commit broke the build because code formatting was off (this was > fixed in a subsequent, unrelated commit). > > I spent some time looking for the issue to check what happened and > couldn't find it anywhere. Github's PR infrastructure > makes it quite convenient to ensure everything passes before it's merged > and it leaves a handy > place to add comments in case something doesn't work - I highly recommend > it. > > Dawid > > On Thu, Nov 17, 2022 at 2:19 AM wrote: > >> This is an automated email from the ASF dual-hosted git repository. >> >> kwright pushed a commit to branch main >> in repository https://gitbox.apache.org/repos/asf/lucene.git >> >> >> The following commit(s) were added to refs/heads/main by this push: >> new b6ebfd18610 Prevent NPEs while still handling the polar case for >> horizontal planes right off the pole >> b6ebfd18610 is described below >> >> commit b6ebfd18610c482109c6a38b2327254848508f03 >> Author: Karl David Wright >> AuthorDate: Wed Nov 16 11:03:24 2022 -0500 >> >> Prevent NPEs while still handling the polar case for horizontal >> planes right off the pole >> --- >> .../java/org/apache/lucene/spatial3d/geom/Plane.java | 20 >> >> 1 file changed, 16 insertions(+), 4 deletions(-) >> >> diff --git >> a/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/Plane.java >> b/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/Plane.java >> index ef9e9773223..9b46c3553bf 100755 >> --- >> a/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/Plane.java >> +++ >> b/lucene/spatial3d/src/java/org/apache/lucene/spatial3d/geom/Plane.java >> @@ 
-1500,9 +1500,14 @@ public class Plane extends Vector { >>} else { >> // Since a==b==0, any plane including the Z axis suffices. >> // System.err.println(" Perpendicular to z"); >> -final GeoPoint[] points = >> +GeoPoint[] points = >> findIntersections(planetModel, normalYPlane, NO_BOUNDS, >> NO_BOUNDS); >> -if (points.length > 0) { >> +if (points.length == 0) { >> + points = findIntersections(planetModel, normalXPlane, >> NO_BOUNDS, NO_BOUNDS); >> +} >> +if (points.length == 0) { >> + boundsInfo.addZValue(new GeoPoint(0.0, 0.0, -this.z)); >> +} else { >>boundsInfo.addZValue(points[0]); >> } >>} >> @@ -2042,9 +2047,16 @@ public class Plane extends Vector { >> } >>} else { >> // Horizontal circle. Since a==b, any vertical plane suffices. >> -final GeoPoint[] points = >> +GeoPoint[] points = >> findIntersections(planetModel, normalXPlane, NO_BOUNDS, >> NO_BOUNDS); >> -boundsInfo.addZValue(points[0]); >> +if (points.length == 0) { >> + points = findIntersections(planetModel, normalYPlane, >> NO_BOUNDS, NO_BOUNDS); >> +} >> +if (points.length == 0) { >> + boundsInfo.addZValue(new GeoPoint(0.0, 0.0, -this.z)); >> +} else { >> + boundsInfo.addZValue(points[0]); >> +} >>} >>// System.err.println("Done latitude bounds"); >> } >> >>
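The change quoted above replaces an unconditional `points[0]` access with a chain of fallbacks: try one vertical plane, then the other, and if neither yields an intersection, add a synthesized point on the Z axis. A minimal sketch of that control flow follows. `Point`, `pickZBoundPoint`, and the supplier parameters are hypothetical names introduced for illustration; the real code calls `findIntersections(...)` on `normalXPlane` and `normalYPlane`.

```java
import java.util.function.Supplier;

class PolarFallbackSketch {
  /** Hypothetical stand-in for a 3D point; not the real GeoPoint class. */
  static final class Point {
    final double x;
    final double y;
    final double z;

    Point(double x, double y, double z) {
      this.x = x;
      this.y = y;
      this.z = z;
    }
  }

  /**
   * Returns the first intersection from the first lookup, else from the second lookup, else a
   * synthesized on-axis point, mirroring the empty-array checks in the quoted diff.
   */
  static Point pickZBoundPoint(
      Supplier<Point[]> firstPlane, Supplier<Point[]> secondPlane, double planeZ) {
    Point[] points = firstPlane.get();
    if (points.length == 0) {
      // The second lookup runs only when the first found nothing.
      points = secondPlane.get();
    }
    if (points.length == 0) {
      // Same expression as the diff: a point on the Z axis at -z.
      return new Point(0.0, 0.0, -planeZ);
    }
    return points[0];
  }

  public static void main(String[] args) {
    Point fallback = pickZBoundPoint(() -> new Point[0], () -> new Point[0], 0.5);
    System.out.println(fallback.z);
  }
}
```

Checking the array length before indexing is what removes the failure mode for horizontal planes right off the pole, where neither vertical plane is guaranteed to intersect.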
Re: Congratulations to the new Lucene PMC Chair, Michael Sokolov!
Congratulations!

On Sat, Feb 20, 2021 at 4:17 PM Namgyu Kim wrote:
> Congratulations, Mike! :D
>
> On Thu, Feb 18, 2021 at 6:32 AM Anshum Gupta wrote:
>> Every year, the Lucene PMC rotates the Lucene PMC chair and Apache Vice
>> President position.
>>
>> This year we nominated and elected Michael Sokolov as the Chair, a
>> decision that the board approved in its February 2021 meeting.
>>
>> Congratulations, Mike!
>>
>> --
>> Anshum Gupta
Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!
Congratulations!

Karl

On Sat, Feb 20, 2021 at 6:28 AM Uwe Schindler wrote:
> Congrats Jan!
>
> Uwe
>
> -
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> *From:* Anshum Gupta
> *Sent:* Thursday, February 18, 2021 7:55 PM
> *To:* Lucene Dev ; solr-u...@lucene.apache.org
> *Subject:* Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!
>
> Hi everyone,
>
> I’d like to inform everyone that the newly formed Apache Solr PMC
> nominated and elected Jan Høydahl for the position of the Solr PMC Chair
> and Vice President. This decision was approved by the board in its February
> 2021 meeting.
>
> Congratulations Jan!
>
> --
> Anshum Gupta
Re: [VOTE] Solr to become a top-level Apache project (TLP)
+1 from me (binding)

Karl

On Tue, May 12, 2020 at 3:54 AM Atri Sharma wrote:
> +1 (binding).
>
> Regards,
>
> Atri
>
> On Tue, 12 May 2020 at 13:07, Dawid Weiss wrote:
>> Dear Lucene and Solr developers!
>>
>> According to an earlier [DISCUSS] thread on the dev list [2], I am
>> calling for a vote on the proposal to make Solr a top-level Apache
>> project (TLP) and separate Lucene and Solr development into two
>> independent entities.
>>
>> To quickly recap the reasons and consequences of such a move: it seems
>> like the reasons for the initial merge of Lucene and Solr, around 10
>> years ago, have been achieved. Both projects are in good shape and
>> exhibit signs of independence already (mailing lists, committers,
>> patch flow). There are many technical considerations that would make
>> development much easier if we move Solr out into its own TLP.
>>
>> We discussed this issue [2] and both PMC members and committers had a
>> chance to review all the pros and cons and express their views. The
>> discussion showed that there are clearly different opinions on the
>> matter - some people are in favor, some are neutral, others are
>> against or not seeing the point of additional labor. Realistically, I
>> don't think reaching 100% level consensus is going to be possible --
>> we are a diverse bunch with different opinions and personalities. I
>> firmly believe this is the right direction hence the decision to put
>> it under the voting process. Should something take a wrong turn in the
>> future (as some folks worry it may), all blame is on me.
>>
>> Therefore, the proposal is to separate Solr from under Lucene TLP, and
>> make it a TLP on its own. The initial structure of the new PMC,
>> committer base, git repositories and other managerial aspects can be
>> worked out during the process if the decision passes.
>>
>> Please indicate one of the following (see [1] for guidelines):
>>
>> [ ] +1 - yes, I vote for the proposal
>> [ ] -1 - no, I vote against the proposal
>>
>> Please note that anyone in the Lucene+Solr community is invited to
>> express their opinion, though only Lucene+Solr committers cast binding
>> votes (indicate non-binding votes in your reply, please).
>>
>> The vote will be active for a week to give everyone a chance to read
>> and cast a vote.
>>
>> Dawid
>>
>> [1] https://www.apache.org/foundation/voting.html
>> [2] https://lists.apache.org/thread.html/rfae2440264f6f874e91545b2030c98e7b7e3854ddf090f7747d338df%40%3Cdev.lucene.apache.org%3E
>
> --
> Regards,
>
> Atri
> Apache Concerted
Re: Welcome Eric Pugh as a Lucene/Solr committer
Welcome, Eric!

On Mon, Apr 6, 2020 at 9:52 AM Steve Rowe wrote:
> Congrats and welcome Eric!
>
> --
> Steve
>
> On Apr 6, 2020, at 8:21 AM, Jan Høydahl wrote:
>> Hi all,
>>
>> Please join me in welcoming Eric Pugh as the latest Lucene/Solr
>> committer!
>>
>> Eric has been part of the Solr community for over a decade, as a code
>> contributor, book author, company founder, blogger and mailing list
>> contributor! We look forward to his future contributions!
>>
>> Congratulations and welcome! It is a tradition to introduce yourself
>> with a brief bio, Eric.
>>
>> Jan Høydahl
Re: Welcome Alessandro Benedetti as a Lucene/Solr committer
Welcome, Alessandro!

Karl

On Wed, Mar 18, 2020 at 9:01 AM David Smiley wrote:
> Hi all,
>
> Please join me in welcoming Alessandro Benedetti as the latest Lucene/Solr
> committer!
>
> Alessandro has been contributing to Lucene and Solr in areas such as More
> Like This, Synonym boosting, and Suggesters, and other areas for years.
> Furthermore he's been a help to many users on the solr-user mailing list
> and has helped others through his blog posts and presentations about
> search. We look forward to his future contributions.
>
> Congratulations and welcome! It is a tradition to introduce yourself with
> a brief bio, Alessandro.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
Re: Congratulations to the new Lucene/Solr PMC Chair, Anshum Gupta!
Congratulations!!

Karl

On Fri, Jan 17, 2020 at 6:37 AM Namgyu Kim wrote:
> Congratulations Anshum! :D
>
> On Fri, Jan 17, 2020 at 7:32 PM Ignacio Vera wrote:
>> Congrats Anshum!
>>
>> On Fri, Jan 17, 2020 at 3:17 AM Shalin Shekhar Mangar <shalinman...@gmail.com> wrote:
>>> Congratulations Anshum!
>>>
>>> On Thu, Jan 16, 2020 at 2:45 AM Cassandra Targett wrote:
>>>> Every year, the Lucene PMC rotates the Lucene PMC chair and Apache
>>>> Vice President position.
>>>>
>>>> This year we have nominated and elected Anshum Gupta as the Chair, a
>>>> decision that the board approved in its January 2020 meeting.
>>>>
>>>> Congratulations, Anshum!
>>>>
>>>> Cassandra
>>>
>>> --
>>> Regards,
>>> Shalin Shekhar Mangar.
Re: Welcome Houston Putman as Lucene/Solr committer
Welcome!

Karl

On Thu, Nov 14, 2019 at 8:17 AM Michael Sokolov wrote:
> Hi Houston, welcome!
>
> On Thu, Nov 14, 2019 at 7:23 AM Erick Erickson wrote:
>> Welcome!
>>
>> On Nov 14, 2019, at 5:19 AM, Jan Høydahl wrote:
>>> Congrats and welcome Houston!
>>>
>>> --
>>> Jan Høydahl, search solution architect
>>> Cominvent AS - www.cominvent.com
>>>
>>> On 14 Nov 2019 at 09:57, Anshum Gupta wrote:
>>>> Hi all,
>>>>
>>>> Please join me in welcoming Houston Putman as the latest Lucene/Solr
>>>> committer!
>>>>
>>>> Houston has been involved with the community since 2013, when he
>>>> first contributed the Analytics contrib module. Since then he has been
>>>> involved with the community, participated in conferences and spoken about
>>>> his work with Lucene/Solr. In the recent past, he has been involved with
>>>> getting Solr to scale on Kubernetes.
>>>>
>>>> Looking forward to your commits to the Apache Lucene/Solr project :)
>>>>
>>>> Congratulations and welcome, Houston! It's a tradition to introduce
>>>> yourself with a brief bio.
>>>>
>>>> --
>>>> Anshum Gupta
Re: Welcome Atri Sharma as Lucene/Solr committer
Welcome! Karl On Wed, Sep 18, 2019 at 11:38 AM Namgyu Kim wrote: > Congratulations and welcome, Atri! XD > > On Thu, Sep 19, 2019 at 12:24 AM, Gus Heck wrote: > >> Welcome :) >> >> On Wed, Sep 18, 2019 at 11:21 AM Kevin Risden wrote: >> >>> Congrats and welcome Atri! >>> >>> Kevin Risden >>> >>> >>> On Wed, Sep 18, 2019 at 11:05 AM Yonik Seeley wrote: >>> Congrats Atri! -Yonik On Wed, Sep 18, 2019 at 3:12 AM Adrien Grand wrote: > Hi all, > > Please join me in welcoming Atri Sharma as Lucene/ Solr committer! > > If you are following activity on Lucene, this name will likely sound > familiar to you: Atri has been very busy trying to improve Lucene over > the past months. In particular, Atri recently started improving our > top-hits optimizations like early termination on sorted indexes and > WAND, when indexes are searched using multiple threads. > > Congratulations and welcome! It is a tradition to introduce yourself > with a brief bio. > > -- > Adrien > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > >> >> -- >> http://www.needhamsoftware.com (work) >> http://www.the111shift.com (play) >> >
Re: Welcome Michael Sokolov as Lucene/ Solr committer
Welcome, Michael! Karl On Mon, May 13, 2019 at 4:52 PM Martin Gainty wrote: > Good luck, Michael! > > > -- > *From:* Erick Erickson > *Sent:* Monday, May 13, 2019 4:11 PM > *To:* dev@lucene.apache.org > *Subject:* Re: Welcome Michael Sokolov as Lucene/ Solr committer > > Welcome Michael! > > > On May 13, 2019, at 2:48 PM, Dawid Weiss wrote: > > > >> I am pretty sure my first interaction with the Apache Solr/Lucene > community was back in 2012, > > > > Yeah... I really don't know how it happened you haven't been > > invited earlier. Everyone just kind of assumed you > > have committer rights already! :) > > > > > > D. > > > > On Mon, May 13, 2019 at 9:23 PM Michael Sokolov > wrote: > >> > >> Thanks Dawid, and thank you to everyone who voted to grant me access > >> to this awesome project! > >> > >> I spent many years building full text search web applications serving > >> large texts (especially dictionaries, encyclopedias, and academic > >> journals). I cut my teeth with AltaVista back in 1998, and tried many > >> other search engines before finally coming around to Solr/Lucene. > >> > >> I am pretty sure my first interaction with the Apache Solr/Lucene > >> community was back in 2012, when I was looking to solve a performance > >> problem we encountered highlighting gigantic documents. Since then > >> I've worked on many projects involving Solr and Lucene, and > >> ElasticSearch, and made various contributions, implemented some of my > >> own extensions, made a separate XML query engine based on Solr (Lux - > >> no longer active), went to a few Lucene/Solr Revolutions (spoke at > >> one), and always in the back of my mind was the idea of contributing > >> more actively and becoming a full participant in this thriving open > >> source project. Now I'm really excited that has come to pass, and look > >> forward to digging in even deeper, and helping to keep this thing > >> going. 
> >> > >> -Mike > >> > >> On Mon, May 13, 2019 at 3:12 PM Dawid Weiss > wrote: > >>> > >>> Hello everyone, > >>> > >>> Please join me in welcoming Michael Sokolov as Lucene/ Solr committer! > >>> > >>> Many of you probably know Mike as he's been around for quite a while > >>> -- answering questions, reviewing patches, providing insight and > >>> actively working on new code. > >>> > >>> Congratulations and welcome! It is a tradition to introduce yourself > >>> with a brief bio, Mike. > >>> > >>> Dawid > > > > - > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > > For additional commands, e-mail: dev-h...@lucene.apache.org > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > >
Re: Welcome Tomoko Uchida as Lucene/Solr committer
Welcome! Karl On Mon, Apr 8, 2019 at 8:28 PM Christian Moen wrote: > Congratulations, Tomoko-san! > > On Tue, Apr 9, 2019 at 12:20 AM Uwe Schindler wrote: > >> Hi all, >> >> Please join me in welcoming Tomoko Uchida as the latest Lucene/Solr >> committer! >> >> She has been working on https://issues.apache.org/jira/browse/LUCENE-2562 >> for several years with awesome progress and finally we got the fantastic >> Luke as a branch on ASF JIRA: >> https://gitbox.apache.org/repos/asf?p=lucene-solr.git;a=shortlog;h=refs/heads/jira/lucene-2562-luke-swing-3 >> Looking forward to the first release of Apache Lucene 8.1 with Luke >> bundled in the distribution. I will take care of merging it to master and >> 8.x branches together with her once she got the ASF account. >> >> Tomoko also helped with the Japanese and Korean Analyzers. >> >> Congratulations and Welcome, Tomoko! Tomoko, it's traditional for you to >> introduce yourself with a brief bio. >> >> Uwe & Robert (who nominated Tomoko) >> >> - >> Uwe Schindler >> Achterdiek 19, D-28357 Bremen >> https://www.thetaphi.de >> eMail: u...@thetaphi.de >> >> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >>
Re: Welcome Ignacio Vera to the PMC
Welcome, Ignacio! Karl On Mon, Mar 4, 2019 at 4:51 AM Alan Woodward wrote: > Congratulations and welcome, Ignacio! > > > On 4 Mar 2019, at 09:09, Adrien Grand wrote: > > > > I am pleased to announce that Ignacio Vera has accepted the PMC's > > invitation to join. > > > > Welcome Ignacio! > > > > -- > > Adrien > > > > - > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > > For additional commands, e-mail: dev-h...@lucene.apache.org > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > >
[jira] [Resolved] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved LUCENE-8696. - Resolution: Fixed Fix Version/s: 7.7.2 master (9.0) 8.x > TestGeo3DPoint.testGeo3DRelations failure > - > > Key: LUCENE-8696 > URL: https://issues.apache.org/jira/browse/LUCENE-8696 > Project: Lucene - Core > Issue Type: Bug > Components: modules/spatial3d >Reporter: Ignacio Vera > Assignee: Karl Wright >Priority: Major > Fix For: 8.x, master (9.0), 7.7.2 > > Attachments: LUCENE-8696.patch > > > Reproduce with: > {code:java} > ant test -Dtestcase=TestGeo3DPoint -Dtests.method=testGeo3DRelations > -Dtests.seed=721195D0198A8470 -Dtests.slow=true -Dtests.badapples=true > -Dtests.locale=sr-RS -Dtests.timezone=Europe/Istanbul -Dtests.asserts=true > -Dtests.file.encoding=ISO-8859-1{code} > Error: > {code:java} > [junit4] FAILURE 1.16s | TestGeo3DPoint.testGeo3DRelations <<< > [junit4] > Throwable #1: java.lang.AssertionError: invalid hits for > shape=GeoStandardPath: {planetmodel=PlanetModel.WGS84, > width=1.3439035240356338(77.01), > points={[[lat=2.4457272005608357E-47, > lon=0.017453291479645996([X=1.0009663787601641, Y=0.017471932090601616, > Z=2.448463612203698E-47])], [lat=2.4457272005608357E-47, > lon=0.8952476719156919([X=0.6260252093310985, Y=0.7812370940381473, > Z=2.448463612203698E-47])], [lat=2.4457272005608357E-47, > lon=0.6491968536639036([X=0.7974608400583222, Y=0.6052232384770843, > Z=2.448463612203698E-47])], [lat=-0.7718789008737459, > lon=0.9236607495528212([X=0.43181767034308555, Y=0.5714183775701452, > Z=-0.6971214014446648])]]}}{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778005#comment-16778005 ] Karl Wright commented on LUCENE-8696: - I have confirmed that the above is indeed the issue. I did this by checking whether intersection with the segment end planes was taking place, and it was. There are two ways forward. First way is to make this hack officially part of the code base. That will probably be fine for real-world paths, because real-world paths are much narrower than what occurs in random testing. The second fix would be to change how we represent segment endpoints, so that there is no gap between one of the points and the adjoining path segment. The way to do that is to use TWO planes rather than one, but only when there are two adjoining segments and a gap is thus present. Membership would be tricky because, depending on the specific conformation of the segment endpoint, EITHER plane or BOTH planes would need to match the point being tested. But we could determine this by simply looking at the fourth point in the context of a plane constructed from the other three. Such a change would finally make GeoPaths first-class citizens in the oblate world, at the cost of needing to have a second plane for each segment endpoint. But there's no reason we can't use class inheritance to solve that problem too. So a base SegmentEndpoint class or interface would have multiple implementations, and the right one could be picked at path construction time, to match the conformation. For SPHERE planets, the simplest implementation would still be the one that got used. 
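The second option in the comment above (a base SegmentEndpoint with several implementations, chosen at path construction time) can be sketched roughly as follows. This is only an illustration of the idea, not the spatial3d code: HalfSpace, SinglePlaneEndpoint, TwoPlaneEndpoint and the create() factory are hypothetical names invented here, and the requireBoth flag stands in for the "EITHER plane or BOTH planes" decision that the comment says would be derived from the endpoint's conformation.

```java
/** Hypothetical sketch of the "one or two planes per segment endpoint" idea. */
final class SegmentEndpointSketch {

  /** A plane A*x + B*y + C*z + D = 0; a point is "within" when the expression is >= 0. */
  static final class HalfSpace {
    final double a, b, c, d;

    HalfSpace(double a, double b, double c, double d) {
      this.a = a;
      this.b = b;
      this.c = c;
      this.d = d;
    }

    boolean isWithin(double x, double y, double z) {
      return a * x + b * y + c * z + d >= 0.0;
    }
  }

  /** Base type: each conformation gets its own membership rule. */
  interface SegmentEndpoint {
    boolean isWithin(double x, double y, double z);
  }

  /** Simplest case (for example a sphere, or a path end with one segment): one plane. */
  static final class SinglePlaneEndpoint implements SegmentEndpoint {
    private final HalfSpace plane;

    SinglePlaneEndpoint(HalfSpace plane) {
      this.plane = plane;
    }

    @Override
    public boolean isWithin(double x, double y, double z) {
      return plane.isWithin(x, y, z);
    }
  }

  /** Two adjoining segments: two planes, combined with AND or OR depending on conformation. */
  static final class TwoPlaneEndpoint implements SegmentEndpoint {
    private final HalfSpace first;
    private final HalfSpace second;
    private final boolean requireBoth;

    TwoPlaneEndpoint(HalfSpace first, HalfSpace second, boolean requireBoth) {
      this.first = first;
      this.second = second;
      this.requireBoth = requireBoth;
    }

    @Override
    public boolean isWithin(double x, double y, double z) {
      boolean inFirst = first.isWithin(x, y, z);
      boolean inSecond = second.isWithin(x, y, z);
      return requireBoth ? (inFirst && inSecond) : (inFirst || inSecond);
    }
  }

  /** Picks the implementation once, at construction time; second may be null. */
  static SegmentEndpoint create(HalfSpace first, HalfSpace second, boolean requireBoth) {
    if (second == null) {
      return new SinglePlaneEndpoint(first);
    }
    return new TwoPlaneEndpoint(first, second, requireBoth);
  }

  public static void main(String[] args) {
    HalfSpace xNonNegative = new HalfSpace(1, 0, 0, 0);
    HalfSpace yNonNegative = new HalfSpace(0, 1, 0, 0);
    SegmentEndpoint either = create(xNonNegative, yNonNegative, false);
    SegmentEndpoint both = create(xNonNegative, yNonNegative, true);
    System.out.println("either: " + either.isWithin(1, -1, 0));
    System.out.println("both:   " + both.isWithin(1, -1, 0));
  }
}
```

Choosing the implementation in a factory keeps the per-point membership test free of conformation checks, which is the appeal of the inheritance approach described in the comment.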
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777688#comment-16777688 ] Karl Wright commented on LUCENE-8696: - Since we've eliminated the computation of the solid's example intersection points, that basically leaves numerical factors as the only potential cause. Let's examine this further. In the case of GeoPaths on the WGS84 globe, path intersection points are described by "circles", which are in fact just planes that are picked so as to connect the path segments together, as described above. But, each plane with two adjoining segments is selected based on FOUR surface points, not three. That means that there is a gap between one of the points and the actual endpoint circle. When we compute membership in the path, we exclude points in that gap from membership. This is done by considering the path segment end planes as delimiters of membership for both the endpoint "circles" as well as the segments. But, those segment end planes are not considered when determining intersection, because they are "interior" to the path. This means that it is possible for getRelationship() to miss an intersection with the path edge if the "gap" is large enough and everything lines up perfectly, and thus "CONTAINS" is reported where "OVERLAPS" would be the actual correct answer. It should be possible to see if our test case would be resolved by considering path segment end edges. A simple trial code change should be sufficient to know. Then the question becomes how to prevent spurious intersections? We could just permit them (it's allowed in the contract), or we could make more significant changes to path representation, for better accuracy. Stay tuned. 
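To make the "gap" in the comment above concrete: three points determine a plane exactly, so a fourth surface point generally lies some distance off it. A minimal sketch of that measurement, assuming plain double[] {x, y, z} points; FourPointGap, its Plane helper and gap() are illustrative names and not part of spatial3d.

```java
/** Minimal sketch: how far a fourth point lies from the plane through three others. */
final class FourPointGap {

  /** Plane A*x + B*y + C*z + D = 0 with a unit-length normal (A, B, C). */
  static final class Plane {
    final double a, b, c, d;

    Plane(double[] p1, double[] p2, double[] p3) {
      // Two edge vectors lying in the plane.
      double ux = p2[0] - p1[0], uy = p2[1] - p1[1], uz = p2[2] - p1[2];
      double vx = p3[0] - p1[0], vy = p3[1] - p1[1], vz = p3[2] - p1[2];
      // Their cross product is perpendicular to the plane.
      double nx = uy * vz - uz * vy;
      double ny = uz * vx - ux * vz;
      double nz = ux * vy - uy * vx;
      double len = Math.sqrt(nx * nx + ny * ny + nz * nz);
      if (len == 0.0) {
        throw new IllegalArgumentException("points are collinear, no unique plane");
      }
      a = nx / len;
      b = ny / len;
      c = nz / len;
      d = -(a * p1[0] + b * p1[1] + c * p1[2]);
    }

    /** Signed distance from the plane; zero means the point is on it. */
    double evaluate(double[] p) {
      return a * p[0] + b * p[1] + c * p[2] + d;
    }
  }

  /** Absolute distance of p4 from the plane through p1, p2 and p3. */
  static double gap(double[] p1, double[] p2, double[] p3, double[] p4) {
    return Math.abs(new Plane(p1, p2, p3).evaluate(p4));
  }

  public static void main(String[] args) {
    double[] p1 = {1, 0, 0};
    double[] p2 = {0, 1, 0};
    double[] p3 = {0, 0, 1};
    double[] p4 = {1, 1, 1};
    System.out.println("gap = " + gap(p1, p2, p3, p4));
  }
}
```

The sign of evaluate() also tells which side of the plane the fourth point falls on, which is the kind of test a later comment on this issue proposes for choosing between conformations.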
[jira] [Comment Edited] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776899#comment-16776899 ] Karl Wright edited comment on LUCENE-8696 at 2/26/19 7:24 AM: -- Reviewing the solid, and what the edge points *should* be: minx, maxx: -0.7731590077686981, 1.0011188539924791 miny, maxy: 0.9519964046486451, 1.0011188539924791 minz, maxz: -0.9977622932859775, 0.9977599768255027 The minz/maxz planes might touch the world at the poles, but probably don't. The maxx plane might touch the world at the max X pole. The minx plane definitely slices the world, so it should generate at least one point. The maxy plane might touch the world at the max Y pole. The miny plane slices the world, so it should generate at least one point. This is the debugging output: {code} [junit4] 2> notableMinXPoints=[] notableMaxXPoints=[] notableMinYPoints=[] notableMaxYPoints=[] notableMinZPoints=[] notableMaxZPoints=[] [junit4] 2> minXEdges=[] maxXEdges=[] minYEdges=[[X=0.0, Y=0.9519964046486451, Z=-0.30870622678085735]] maxYEdges=[[X=-0.0, Y=1.0011188539924791, Z=0.0]] minZEdges=[] maxZEdges=[] {code} "Notable points" are places where the plane intersections also intersect the world. There are none of these, as expected. The planes that intersect the world are minY and maxY. We do *not* see intersections for minX, though, and we expected to. That's got to be researched to figure out why. It may be because the intersection is actually outside the solid bounds as determined by the Y plane. So the question becomes whether the line (-0.7731590077686981, 0.9519964046486451, t) ever can go through the world? We can surely determine that by picking value 0, and computing the distance to the origin: sqrt(x^2 + y^2 + 0) = 1.2264061340998847885343642874005, which is indeed off the surface. So the points look reasonable. 
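The check at the end of the comment above can be reproduced in a few lines. A small sketch, with one assumption flagged: the largest bound quoted in the comment (1.0011188539924791) is used here as a stand-in for the maximum radius of the scaled planet model; OriginDistance and its methods are illustrative names only.

```java
/** Sketch of the "does the minX/minY edge line ever reach the world?" check. */
final class OriginDistance {

  // Assumption: the largest x/y bound quoted in the comment approximates the
  // planet's maximum radius in the scaled model.
  static final double ASSUMED_MAX_RADIUS = 1.0011188539924791;

  static double distanceToOrigin(double x, double y, double z) {
    return Math.sqrt(x * x + y * y + z * z);
  }

  /**
   * The line (x, y, t) is closest to the origin at t = 0, so if that point is
   * already outside the maximum radius the line never touches the surface.
   */
  static boolean lineCanReachWorld(double x, double y, double maxRadius) {
    return distanceToOrigin(x, y, 0.0) <= maxRadius;
  }

  public static void main(String[] args) {
    double minX = -0.7731590077686981;
    double minY = 0.9519964046486451;
    System.out.println("distance at t=0: " + distanceToOrigin(minX, minY, 0.0));
    System.out.println("can reach world: " + lineCanReachWorld(minX, minY, ASSUMED_MAX_RADIUS));
  }
}
```

With the values from the comment the distance comes out at roughly 1.2264, well above the assumed radius, which agrees with the conclusion that the missing minX intersection is expected.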
was (Author: kwri...@metacarta.com): Reviewing the solid, and what the edge points *should* be: minx, maxx: -0.7731590077686981, 1.0011188539924791 miny, maxy: 0.9519964046486451, 1.0011188539924791 minz, maxz: -0.9977622932859775, 0.9977599768255027 The minz/maxz planes might touch the world at the poles, but probably don't. The maxx plane might touch the world at the max X pole. The minx plane definitely slices the world, so it should generate at least one point. The maxy plane might touch the world at the max Y pole. The miny plane slices the world, so it should generate at least one point. This is the debugging output: {code} [junit4] 2> notableMinXPoints=[] notableMaxXPoints=[] notableMinYPoints=[] notableMaxYPoints=[] notableMinZPoints=[] notableMaxZPoints=[] [junit4] 2> minXEdges=[] maxXEdges=[] minYEdges=[[X=0.0, Y=0.9519964046486451, Z=-0.30870622678085735]] maxYEdges=[[X=-0.0, Y=1.0011188539924791, Z=0.0]] minZEdges=[] maxZEdges=[] {code} "Notable points" are places where the plane intersections also intersect the world. There are none of these, as expected. The planes that intersect the world are minY and maxY. We do *not* see intersections for minX, though, and we expected to. That's got to be researched to figure out why. It may be because the intersection is actually outside the solid bounds as determined by the Y plane. Out of time for the moment though. 
[jira] [Commented] (SOLR-13270) SolrJ does not send "Expect: 100-continue" header
[ https://issues.apache.org/jira/browse/SOLR-13270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777186#comment-16777186 ] Karl Wright commented on SOLR-13270: I just grepped for it and did not find it explicitly set: {code} kawright@1USDKAWRIGHT:/mnt/c/wipgit/lucene4/lucene-solr$ grep -R "setExpectContinue" . --include "*.java" kawright@1USDKAWRIGHT:/mnt/c/wipgit/lucene4/lucene-solr$ {code} I therefore believe it's being set because the RequestConfig is being overwritten. And, sure enough: {code} kawright@1USDKAWRIGHT:/mnt/c/wipgit/lucene4/lucene-solr$ grep -R "setDefaultRequestConfig" . --include "*.java" ./lucene/replicator/src/java/org/apache/lucene/replicator/http/HttpClientBase.java: httpc = HttpClientBuilder.create().setConnectionManager(conMgr).setDefaultRequestConfig(this.defaultConfig).build(); ./solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClientUtil.java: HttpClientBuilder retBuilder = builder.setDefaultRequestConfig(requestConfig); kawright@1USDKAWRIGHT:/mnt/c/wipgit/lucene4/lucene-solr$ {code} > SolrJ does not send "Expect: 100-continue" header > - > > Key: SOLR-13270 > URL: https://issues.apache.org/jira/browse/SOLR-13270 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 7.7 >Reporter: Erlend Garåsen >Priority: Major > > SolrJ does not set the "Expect: 100-continue" header, even though it's > configured in HttpClient: > {code:java} > builder.setDefaultRequestConfig(RequestConfig.custom().setExpectContinueEnabled(true).build());{code} > A HttpClient developer has reviewed the code and says we're setting up > the client correctly, so we have a reason to believe there is a bug in > SolrJ. 
It's actually a problem we are facing in ManifoldCF, explained in: > https://issues.apache.org/jira/browse/CONNECTORS-1564 > The problem can be reproduced by building and running the following small > Maven project: > [http://folk.uio.no/erlendfg/solr/missing-header.zip] > The application runs SolrJ code where the header does not show up and > HttpClient code where the header is present. > > {code:java} > HttpClientBuilder builder = HttpClients.custom(); > // This should add an Expect: 100-continue header: > builder.setDefaultRequestConfig(RequestConfig.custom().setExpectContinueEnabled(true).build()); > HttpClient httpClient = builder.build(); > // Start Solr and create a core named "test". > String baseUrl = "http://localhost:8983/solr/test"; > // Test using SolrJ - no expect 100 header > HttpSolrClient client = new HttpSolrClient.Builder() > .withHttpClient(httpClient) > .withBaseSolrUrl(baseUrl).build(); > SolrQuery query = new SolrQuery(); > query.setQuery("*:*"); > client.query(query); > // Test using HttpClient directly - expect 100 header shows up: > HttpPost httpPost = new HttpPost(baseUrl); > HttpEntity entity = new InputStreamEntity(new > ByteArrayInputStream("test".getBytes())); > httpPost.setEntity(entity); > httpClient.execute(httpPost); > {code} > When using the last HttpClient test, the expect 100 header appears in > missing-header.log: > {noformat} > http-outgoing-1 >> Expect: 100-continue{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-13270) SolrJ does not send "Expect: 100-continue" header
[ https://issues.apache.org/jira/browse/SOLR-13270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777113#comment-16777113 ] Karl Wright commented on SOLR-13270: Hi [~erlendfg], can you identify where in the SolrJ code it explicitly sets expect/continue to "off"? It must be there somewhere. > SolrJ does not send "Expect: 100-continue" header > - > > Key: SOLR-13270 > URL: https://issues.apache.org/jira/browse/SOLR-13270 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 7.7 >Reporter: Erlend Garåsen >Priority: Major > > SolrJ does not set the "Expect: 100-continue" header, even though it's > configured in HttpClient: > {code:java} > builder.setDefaultRequestConfig(RequestConfig.custom().setExpectContinueEnabled(true).build());{code} > A HttpClient developer has reviewed the code and says we're setting up > the client correctly, so we have a reason to believe there is a bug in > SolrJ. It's actually a problem we are facing in ManifoldCF, explained in: > https://issues.apache.org/jira/browse/CONNECTORS-1564 > The problem can be reproduced by building and running the following small > Maven project: > [http://folk.uio.no/erlendfg/solr/missing-header.zip] > The application runs SolrJ code where the header does not show up and > HttpClient code where the header is present. > > {code:java} > HttpClientBuilder builder = HttpClients.custom(); > // This should add an Expect: 100-continue header: > builder.setDefaultRequestConfig(RequestConfig.custom().setExpectContinueEnabled(true).build()); > HttpClient httpClient = builder.build(); > // Start Solr and create a core named "test". 
> String baseUrl = "http://localhost:8983/solr/test"; > // Test using SolrJ - no expect 100 header > HttpSolrClient client = new HttpSolrClient.Builder() > .withHttpClient(httpClient) > .withBaseSolrUrl(baseUrl).build(); > SolrQuery query = new SolrQuery(); > query.setQuery("*:*"); > client.query(query); > // Test using HttpClient directly - expect 100 header shows up: > HttpPost httpPost = new HttpPost(baseUrl); > HttpEntity entity = new InputStreamEntity(new > ByteArrayInputStream("test".getBytes())); > httpPost.setEntity(entity); > httpClient.execute(httpPost); > {code} > When using the last HttpClient test, the expect 100 header appears in > missing-header.log: > {noformat} > http-outgoing-1 >> Expect: 100-continue{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776975#comment-16776975 ] Karl Wright commented on LUCENE-8696: - [~jpountz], should be addressed now. > TestGeo3DPoint.testGeo3DRelations failure > - > > Key: LUCENE-8696 > URL: https://issues.apache.org/jira/browse/LUCENE-8696 > Project: Lucene - Core > Issue Type: Bug > Components: modules/spatial3d >Reporter: Ignacio Vera > Assignee: Karl Wright >Priority: Major > Attachments: LUCENE-8696.patch > > > Reproduce with: > {code:java} > ant test -Dtestcase=TestGeo3DPoint -Dtests.method=testGeo3DRelations > -Dtests.seed=721195D0198A8470 -Dtests.slow=true -Dtests.badapples=true > -Dtests.locale=sr-RS -Dtests.timezone=Europe/Istanbul -Dtests.asserts=true > -Dtests.file.encoding=ISO-8859-1{code} > Error: > {code:java} > [junit4] FAILURE 1.16s | TestGeo3DPoint.testGeo3DRelations <<< > [junit4] > Throwable #1: java.lang.AssertionError: invalid hits for > shape=GeoStandardPath: {planetmodel=PlanetModel.WGS84, > width=1.3439035240356338(77.01), > points={[[lat=2.4457272005608357E-47, > lon=0.017453291479645996([X=1.0009663787601641, Y=0.017471932090601616, > Z=2.448463612203698E-47])], [lat=2.4457272005608357E-47, > lon=0.8952476719156919([X=0.6260252093310985, Y=0.7812370940381473, > Z=2.448463612203698E-47])], [lat=2.4457272005608357E-47, > lon=0.6491968536639036([X=0.7974608400583222, Y=0.6052232384770843, > Z=2.448463612203698E-47])], [lat=-0.7718789008737459, > lon=0.9236607495528212([X=0.43181767034308555, Y=0.5714183775701452, > Z=-0.6971214014446648])]]}}{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776899#comment-16776899 ] Karl Wright edited comment on LUCENE-8696 at 2/25/19 2:39 PM: -- Reviewing the solid, and what the edge points *should* be: minx, maxx: -0.7731590077686981, 1.0011188539924791 miny, maxy: 0.9519964046486451, 1.0011188539924791 minz, maxz: -0.9977622932859775, 0.9977599768255027 The minz/maxz planes might touch the world at the poles, but probably don't. The maxx plane might touch the world at the max X pole. The minx plane definitely slices the world, so it should generate at least one point. The maxy plane might touch the world at the max Y pole. The miny plane slices the world, so it should generate at least one point. This is the debugging output: {code} [junit4] 2> notableMinXPoints=[] notableMaxXPoints=[] notableMinYPoints=[] notableMaxYPoints=[] notableMinZPoints=[] notableMaxZPoints=[] [junit4] 2> minXEdges=[] maxXEdges=[] minYEdges=[[X=0.0, Y=0.9519964046486451, Z=-0.30870622678085735]] maxYEdges=[[X=-0.0, Y=1.0011188539924791, Z=0.0]] minZEdges=[] maxZEdges=[] {code} "Notable points" are places where the plane intersections also intersect the world. There are none of these, as expected. The planes that intersect the world are minY and maxY. We do *not* see intersections for minX, though, and we expected to. That's got to be researched to figure out why. It may be because the intersection is actually outside the solid bounds as determined by the Y plane. Out of time for the moment though. was (Author: kwri...@metacarta.com): Reviewing the solid, and what the edge points *should* be: minx, maxx: -0.7731590077686981, 1.0011188539924791 miny, maxy: 0.9519964046486451, 1.0011188539924791 minz, maxz: -0.9977622932859775, 0.9977599768255027 The minz/maxz planes might touch the world at the poles, but probably don't. The maxx plane might touch the world at the max X pole. 
The minx plane definitely slices the world, so it should generate at least one point. The maxy plane might touch the world at the max Y pole. The miny plane slices the world, so it should generate at least one point. We therefore should expect a minimum of two points, which is what we see. If any of these planes actually encounters the pole, though, we should have gotten another point from that. The maxZ plane looks potentially like it might qualify. Out of time for the moment though. > TestGeo3DPoint.testGeo3DRelations failure > - > > Key: LUCENE-8696 > URL: https://issues.apache.org/jira/browse/LUCENE-8696 > Project: Lucene - Core > Issue Type: Bug > Components: modules/spatial3d >Reporter: Ignacio Vera >Assignee: Karl Wright >Priority: Major > Attachments: LUCENE-8696.patch > > > Reproduce with: > {code:java} > ant test -Dtestcase=TestGeo3DPoint -Dtests.method=testGeo3DRelations > -Dtests.seed=721195D0198A8470 -Dtests.slow=true -Dtests.badapples=true > -Dtests.locale=sr-RS -Dtests.timezone=Europe/Istanbul -Dtests.asserts=true > -Dtests.file.encoding=ISO-8859-1{code} > Error: > {code:java} > [junit4] FAILURE 1.16s | TestGeo3DPoint.testGeo3DRelations <<< > [junit4] > Throwable #1: java.lang.AssertionError: invalid hits for > shape=GeoStandardPath: {planetmodel=PlanetModel.WGS84, > width=1.3439035240356338(77.01), > points={[[lat=2.4457272005608357E-47, > lon=0.017453291479645996([X=1.0009663787601641, Y=0.017471932090601616, > Z=2.448463612203698E-47])], [lat=2.4457272005608357E-47, > lon=0.8952476719156919([X=0.6260252093310985, Y=0.7812370940381473, > Z=2.448463612203698E-47])], [lat=2.4457272005608357E-47, > lon=0.6491968536639036([X=0.7974608400583222, Y=0.6052232384770843, > Z=2.448463612203698E-47])], [lat=-0.7718789008737459, > lon=0.9236607495528212([X=0.43181767034308555, Y=0.5714183775701452, > Z=-0.6971214014446648])]]}}{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For 
additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776899#comment-16776899 ] Karl Wright commented on LUCENE-8696: - Reviewing the solid, and what the edge points *should* be: minx, maxx: -0.7731590077686981, 1.0011188539924791 miny, maxy: 0.9519964046486451, 1.0011188539924791 minz, maxz: -0.9977622932859775, 0.9977599768255027 The minz/maxz planes might touch the world at the poles, but probably don't. The maxx plane might touch the world at the max X pole. The minx plane definitely slices the world, so it should generate at least one point. The maxy plane might touch the world at the max Y pole. The miny plane slices the world, so it should generate at least one point. We therefore should expect a minimum of two points, which is what we see. If any of these planes actually encounters the pole, though, we should have gotten another point from that. The maxZ plane looks potentially like it might qualify. Out of time for the moment though. 
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776881#comment-16776881 ] Karl Wright commented on LUCENE-8696:

Reviewing the solid edge point logic finds nothing wrong. Will try to rule out numerical precision problems next.
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776841#comment-16776841 ] Karl Wright commented on LUCENE-8696:

I've verified that there are two solid edge points and they both lie within the path:

{code}
[junit4] 2> solid edge point [X=0.0, Y=0.9519964046486451, Z=-0.30870622678085735] path.isWithin()? true
[junit4] 2> solid edge point [X=-0.0, Y=1.0011188539924791, Z=0.0] path.isWithin()? true
[junit4] 2> path edge point [X=0.22516844226485835, Y=0.003930329545205224, Z=0.9721897091178435] isWithin()? false minx=0.9983274500335564 maxx=-0.7759504117276208 miny=-0.9480660751034399 maxy=-0.9971885244472739 minz=1.969952002403821 maxz=-0.025570267707659133
{code}

So this confirms that no intersection is detected, and it explains how the code arrives at the conclusion that the solid is completely within the path. Possible errors that would cause this:

(1) We might be missing a solid edge point. These edge points are computed based on the lines of intersection between adjoining solid planes and the surface of the world. There is also special computation to handle the case where a solid edge plane intersects the world by itself, but this logic might not be complete. We need to capture all plane/world intersection closed curves and come up with an example point for each.

(2) There might be numerical precision issues with intersection computation that prevent us from concluding that the path edges intersect the solid edges.

I still have to figure out which is the real problem here.
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776821#comment-16776821 ] Karl Wright commented on LUCENE-8696:

Looking at the actual failure now. Basically, the problem is that the relationship between the XYZSolid and the GeoPath is reported as containment: the XYZSolid is reported to be inside the GeoPath. It reaches this conclusion because it detects no intersections between the solid and the path edges, and because the path edge point it is using is outside the solid:

{code}
[junit4] 1> in isShapeInsideArea
[junit4] 1> there are 1 pathPoints
[junit4] 1> pathpoint [X=0.22516844226485835, Y=0.003930329545205224, Z=0.9721897091178435]...
[junit4] 1> outside
{code}

Haven't verified it yet, but this implies that at least one of the solid's surface points is inside of the path too. Still too early to know which conclusion is incorrect.
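The decision described in comment 16776821 can be pictured as a small truth table. The sketch below is a simplification written for illustration, not the actual spatial3d relationship code: the class `RelationSketch`, the enum constants, and the three boolean inputs are all hypothetical names, and the real implementation handles more cases than this.

```java
/** Hypothetical simplification of the solid/path relationship decision; not Lucene code. */
final class RelationSketch {
  enum Relation { OVERLAPS, SOLID_WITHIN_PATH, PATH_WITHIN_SOLID, DISJOINT }

  /**
   * @param edgesIntersect        true if any solid edge crosses any path edge
   * @param pathEdgePointInSolid  true if the sampled path edge point lies inside the solid
   * @param solidEdgePointsInPath true if the sampled solid edge points lie inside the path
   */
  static Relation relate(
      boolean edgesIntersect, boolean pathEdgePointInSolid, boolean solidEdgePointsInPath) {
    if (edgesIntersect) {
      return Relation.OVERLAPS; // crossing edges always mean partial overlap
    }
    if (solidEdgePointsInPath && !pathEdgePointInSolid) {
      return Relation.SOLID_WITHIN_PATH; // the case reported in this issue
    }
    if (pathEdgePointInSolid && !solidEdgePointsInPath) {
      return Relation.PATH_WITHIN_SOLID;
    }
    if (!pathEdgePointInSolid && !solidEdgePointsInPath) {
      return Relation.DISJOINT;
    }
    // Each shape's sample points are inside the other without crossing edges:
    // this sketch treats the ambiguous case as an overlap.
    return Relation.OVERLAPS;
  }

  public static void main(String[] args) {
    // Inputs as reported in the thread: no intersections, path point outside, solid points inside.
    System.out.println(relate(false, false, true));
  }
}
```

The point of the table is that a single missed intersection or a single missing edge point flips the answer to containment, which is why both possibilities are being investigated in the thread.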
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776637#comment-16776637 ] Karl Wright commented on LUCENE-8696:

I revised the simple test case to match the actual failure, and committed it with @AwaitsFix. I'm now committing to master, branch_7x, and branch_8x. No further fixes for branch_6x.
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776617#comment-16776617 ] Karl Wright commented on LUCENE-8696:

[~ivera], I'm looking at your test case for reproducing the original failure and I honestly can't find any place in testGeo3DRelations where we expect two paths with different widths to exactly fit inside one another. The only relationships that are computed in this test are between an xyz solid and a path. Can you describe how you came up with the simplified test case?
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776010#comment-16776010 ] Karl Wright commented on LUCENE-8696:

More debugging shows that the second circle plane is wildly different in the two runs:

{code}
[junit4] 1> Checking 'iswithin' for 0.020717830200521595 0.9523290534985549 0.30699177254488114
[junit4] 1> pathPoint...
[junit4] 1> outside of circle [A=0.9998476951745469, B=0.01745240539714465, C=-0.0, D=-0.5409068252602056, side=1.0]
[junit4] 1> pathPoint...
[junit4] 1> passes circle plane [A=0.7071067811865476, B=-0.7071067811865476, C=0.0, D=0.05929892163149414, side=-1.0]
[junit4] 1> within!
[junit4] 1> Checking 'iswithin' for 0.020717830200521595 0.9523290534985549 0.30699177254488114
[junit4] 1> pathPoint...
[junit4] 1> outside of circle [A=0.9998476951745469, B=0.017452405397144648, C=-0.0, D=-0.22520274172912894, side=1.0]
[junit4] 1> pathPoint...
[junit4] 1> outside of circle [A=0.7863183388224225, B=-0.6178215519319035, C=0.0, D=-0.0021572780909792644, side=1.0]
[junit4] 1> pathPoint...
[junit4] 1> outside of cutoff plane [A=0.6045468388328157, B=-0.796569594986684, C=-3.0241383426688587E-48, D=0.0, side=1.0]
[junit4] 1> pathPoint...
[junit4] 1> outside of cutoff plane [A=-0.6885949363624547, B=-0.29030954074708304, C=-0.6644978436136604, D=0.0, side=1.0]
[junit4] 1> segment...
[junit4] 1> segment...
[junit4] 1> segment...
{code}

For the successful run it's:
[A=0.7071067811865476, B=-0.7071067811865476, C=0.0, D=0.05929892163149414, side=-1.0]

For the failed run it's:
[A=0.7863183388224225, B=-0.6178215519319035, C=0.0, D=-0.0021572780909792644, side=1.0]

The naive expectation would be that the vector is identical (A,B,C), but the displacement differs (D). But because this is WGS84, that expectation is incorrect, because oblateness can affect the vector. Because of oblateness, the circle is constructed from three of the four points where the segment edges intersect. Which three it picks is random, but the hope is that the selection is not important.

What this shows is that very wide paths on oblate spheroids are mathematically unrelatable to each other. This is not exactly surprising in retrospect; paths were originally designed for a SPHERE world and retrofitting them to WGS84 involved compromises. I therefore think the best approach might be to modify the test suite to limit the width of paths tested on WGS84. [~ivera], what do you think?
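The two circle planes quoted in comment 16776010 can be checked by hand against the test point from the same log. The sketch below assumes a sign convention that is inferred from the logged output rather than confirmed from the source: a point counts as within when `A*x + B*y + C*z + D` has the same sign as `side`. The class `SidedPlaneSketch` and its methods are hypothetical stand-ins, not the spatial3d classes.

```java
/** Hypothetical stand-in for a plane with a preferred side; the sign convention is an assumption. */
final class SidedPlaneSketch {
  final double a, b, c, d, side;

  SidedPlaneSketch(double a, double b, double c, double d, double side) {
    this.a = a;
    this.b = b;
    this.c = c;
    this.d = d;
    this.side = side;
  }

  /** Signed value of the plane equation at (x, y, z). */
  double evaluate(double x, double y, double z) {
    return a * x + b * y + c * z + d;
  }

  /** Assumed convention: on the plane, or on the same side as the sign of {@code side}. */
  boolean isWithin(double x, double y, double z) {
    double v = evaluate(x, y, z);
    return v == 0.0 || Math.signum(v) == Math.signum(side);
  }

  public static void main(String[] args) {
    // Test point from the 'iswithin' log lines.
    double x = 0.020717830200521595, y = 0.9523290534985549, z = 0.30699177254488114;

    // Second circle plane of the first run (logged as "passes circle plane").
    SidedPlaneSketch firstRun = new SidedPlaneSketch(
        0.7071067811865476, -0.7071067811865476, 0.0, 0.05929892163149414, -1.0);
    // Second circle plane of the second run (logged as "outside of circle").
    SidedPlaneSketch secondRun = new SidedPlaneSketch(
        0.7863183388224225, -0.6178215519319035, 0.0, -0.0021572780909792644, 1.0);

    System.out.println(firstRun.isWithin(x, y, z));  // true
    System.out.println(secondRun.isWithin(x, y, z)); // false
  }
}
```

Both planes evaluate to a negative value at the test point (roughly -0.60 and -0.57), so under the assumed convention the outcome is decided entirely by the flipped `side`, which matches the "passes circle plane" and "outside of circle" lines in the log.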
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776006#comment-16776006 ] Karl Wright commented on LUCENE-8696:

Added some simple diagnostics. The difference lies in the construction of the second circle plane:

{code}
[junit4] 1> Checking 'iswithin' for 0.020717830200521595 0.9523290534985549 0.30699177254488114
[junit4] 1> pathPoint...
[junit4] 1> outside of circle
[junit4] 1> pathPoint...
[junit4] 1> within!
[junit4] 1> Checking 'iswithin' for 0.020717830200521595 0.9523290534985549 0.30699177254488114
[junit4] 1> pathPoint...
[junit4] 1> outside of circle
[junit4] 1> pathPoint...
[junit4] 1> outside of circle
[junit4] 1> pathPoint...
[junit4] 1> outside of cutoff plane [A=0.6045468388328157, B=-0.796569594986684, C=-3.0241383426688587E-48, D=0.0, side=1.0]
[junit4] 1> pathPoint...
[junit4] 1> outside of cutoff plane [A=-0.6885949363624547, B=-0.29030954074708304, C=-0.6644978436136604, D=0.0, side=1.0]
[junit4] 1> segment...
[junit4] 1> segment...
[junit4] 1> segment...
{code}

So the second circle plane accepts the point in the narrower case, but rejects it in the wider case. Digging further.
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776002#comment-16776002 ] Karl Wright commented on LUCENE-8696:

Hmm, even when I use createSurfacePoint() with this point, it still fails. So I need to look deeper.
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775839#comment-16775839 ] Karl Wright commented on LUCENE-8696:

Preliminary results indicate that the problem may be due to the fact that the point isn't on the surface. The following test fails:

{code}
GeoPoint check = new GeoPoint(0.02071783020158524, 0.9523290535474472, 0.30699177256064203);
assertTrue(PlanetModel.WGS84.pointOnSurface(check));
{code}

Because path geometry uses surface circles and parallel slicing planes, paths can be particularly susceptible to misconstruing membership for points that are off the world. I'll try to confirm this picture.
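For readers who want to see what an on-surface test amounts to for comment 16775839, here is a minimal, self-contained sketch of the ellipsoid equation. It is not the `PlanetModel.pointOnSurface` implementation. The class `SurfaceCheckSketch`, its method names, and the tolerance are hypothetical. The equatorial semi-axis is assumed to be the 1.0011188539924791 seen elsewhere in this thread, and the polar semi-axis is derived from the standard WGS84 ratio of polar to equatorial radius (6356752.314245 m to 6378137.0 m).

```java
/** Hypothetical on-surface check for a scaled oblate spheroid; not the Lucene implementation. */
final class SurfaceCheckSketch {
  /** Value of (x^2 + y^2)/ab^2 + z^2/c^2 - 1, which is zero exactly on the surface. */
  static double surfaceResidual(double x, double y, double z, double xyRadius, double zRadius) {
    return (x * x + y * y) / (xyRadius * xyRadius) + (z * z) / (zRadius * zRadius) - 1.0;
  }

  /** True when the residual is within the given tolerance of zero. */
  static boolean onSurface(
      double x, double y, double z, double xyRadius, double zRadius, double eps) {
    return Math.abs(surfaceResidual(x, y, z, xyRadius, zRadius)) <= eps;
  }

  public static void main(String[] args) {
    double xyRadius = 1.0011188539924791;                    // assumption: value seen in this thread
    double zRadius = xyRadius * (6356752.314245 / 6378137.0); // assumption: standard WGS84 axis ratio
    double eps = 1e-12;                                       // assumption: illustrative tolerance

    // Point from the failing assertion; the residual shows how far off the surface it is.
    double r = surfaceResidual(
        0.02071783020158524, 0.9523290535474472, 0.30699177256064203, xyRadius, zRadius);
    System.out.println("residual = " + r);
    System.out.println("equator point on surface: " + onSurface(xyRadius, 0, 0, xyRadius, zRadius, eps));
  }
}
```

A residual that is small but larger than the tolerance would be consistent with the picture in the comment, where a point slightly off the world confuses the circle and slicing-plane membership checks.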
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775050#comment-16775050 ] Karl Wright commented on LUCENE-8696:

The path in the test retraces its steps, but that should not be a problem for membership testing. I'll look into it starting this evening.
[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16771682#comment-16771682 ] Karl Wright commented on LUCENE-8696:

[~ivera], would you be willing to construct a simple test case? I can't possibly look at this until the weekend, but it would help.
[jira] [Assigned] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned LUCENE-8696:

Assignee: Karl Wright
Re: Congratulations to the new Lucene/Solr PMC chair, Cassandra Targett
Congratulations! On Mon, Dec 31, 2018 at 5:20 PM Michael Sokolov wrote: > Heavy is the head that wears the crown - congrats and thank you! And > here's to a peaceful transition of power in the new year :) > > On Mon, Dec 31, 2018 at 1:39 PM Dawid Weiss wrote: > > > > Congratulations, Cassandra! > > > > On Mon, Dec 31, 2018 at 7:04 PM Gus Heck wrote: > > > > > > Congratulations :) > > > > > > On Mon, Dec 31, 2018, 12:48 PM Alexandre Rafalovitch < > arafa...@gmail.com wrote: > > >> > > >> Congratulations. > > >> > > >> Regards, > > >>Alex > > >> > > >> On Mon, 31 Dec 2018 at 11:31, David Smiley > wrote: > > >> > > > >> > Congrats Cassandra! > > >> > > > >> > On Mon, Dec 31, 2018 at 11:28 AM Erick Erickson < > erickerick...@gmail.com> wrote: > > >> >> > > >> >> Congrats Cassandra! > > >> >> > > >> >> On Sun, Dec 30, 2018 at 11:38 PM Adrien Grand > wrote: > > >> >> > > > >> >> > Every year, the Lucene PMC rotates the Lucene PMC chair and > Apache > > >> >> > Vice President position. > > >> >> > > > >> >> > This year we have nominated and elected Cassandra Targett as the > > >> >> > chair, a decision that the board approved in its December 2018 > > >> >> > meeting. > > >> >> > > > >> >> > Congratulations, Cassandra! 
> > >> >> > > > >> >> > -- > > >> >> > Adrien > > >> >> > > > >> >> > > - > > >> >> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > > >> >> > For additional commands, e-mail: dev-h...@lucene.apache.org > > >> >> > > > >> >> > > >> >> > - > > >> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > > >> >> For additional commands, e-mail: dev-h...@lucene.apache.org > > >> >> > > >> > -- > > >> > Lucene/Solr Search Committer (PMC), Developer, Author, Speaker > > >> > LinkedIn: http://linkedin.com/in/davidwsmiley | Book: > http://www.solrenterprisesearchserver.com > > >> > > >> - > > >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > > >> For additional commands, e-mail: dev-h...@lucene.apache.org > > >> > > > > - > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > > For additional commands, e-mail: dev-h...@lucene.apache.org > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > >
[jira] [Commented] (LUCENE-8587) Self comparison bug in GeoComplexPolygon.equals method
[ https://issues.apache.org/jira/browse/LUCENE-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717159#comment-16717159 ] Karl Wright commented on LUCENE-8587: - Thinking about it, it seems safest to me to serialize and deserialize all five GeoPoint values -- lat, lon, x, y, z. If that's done then no modifications would be needed to GeoStandardCircle and GeoExactCircle, and we wouldn't need to guess at whether it's all going to work. The downside is that the serialized size is going to grow by a factor of 2 -- but that may not be horrible. > Self comparison bug in GeoComplexPolygon.equals method > -- > > Key: LUCENE-8587 > URL: https://issues.apache.org/jira/browse/LUCENE-8587 > Project: Lucene - Core > Issue Type: Bug > Components: modules/spatial3d >Affects Versions: 7.1 >Reporter: Zsolt Gyulavari >Assignee: Karl Wright >Priority: Major > Attachments: LUCENE-8587.patch > > > GeoComplexPolygon.equals method checks equality with own testPoint1 field > instead of the other.testPoint1. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8587) Self comparison bug in GeoComplexPolygon.equals method
[ https://issues.apache.org/jira/browse/LUCENE-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717068#comment-16717068 ] Karl Wright commented on LUCENE-8587: - It appears GeoStandardCircle and GeoExactCircle require lat/lon as arguments, so in order to make this work I'd need to make some changes there as well, including adding constructors that accept GeoPoints. I'm also a bit queasy about the fact that after deserialization the point methods getLatitude() and getLongitude() will return different values than they would before serialization. I don't see any obvious place where this might blow up but it will take more analysis to be sure.
[jira] [Commented] (LUCENE-8587) Self comparison bug in GeoComplexPolygon.equals method
[ https://issues.apache.org/jira/browse/LUCENE-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717035#comment-16717035 ] Karl Wright commented on LUCENE-8587: - What I'd like to do is change the GeoPoint serialization and deserialization to save the (x,y,z) tuples rather than the (lat,lon) ones:

{code}
@Override
public void write(final OutputStream outputStream) throws IOException {
  SerializableObject.writeDouble(outputStream, x);
  SerializableObject.writeDouble(outputStream, y);
  SerializableObject.writeDouble(outputStream, z);
}
{code}

and

{code}
public GeoPoint(final PlanetModel planetModel, final InputStream inputStream) throws IOException {
  // Note: this relies on left-right parameter execution order!! Much code depends on that though and
  // it is apparently in a java spec: https://stackoverflow.com/questions/2201688/order-of-execution-of-parameters-guarantees-in-java
  this(planetModel, SerializableObject.readDouble(inputStream), SerializableObject.readDouble(inputStream), SerializableObject.readDouble(inputStream));
}
{code}

This is not a backwards compatible change, however, so we could make it only in master and not pull it up to the 7.x and 6.x branches. [~ivera], what do you think?
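For anyone who wants to try the (x,y,z) round trip outside of Lucene, here is a minimal sketch. It is only an illustration under stated assumptions: XyzPoint and roundTrip are hypothetical names, and the JDK's DataOutputStream/DataInputStream stand in for Lucene's SerializableObject helpers. The reading constructor depends on the same left-to-right argument evaluation mentioned in the comment above, which the Java Language Specification guarantees in section 15.7.4.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for a point serialized as (x, y, z); this is not Lucene's GeoPoint. */
final class XyzPoint {
  final double x;
  final double y;
  final double z;

  XyzPoint(final double x, final double y, final double z) {
    this.x = x;
    this.y = y;
    this.z = z;
  }

  /** Reads x, then y, then z. Arguments are evaluated left to right (JLS 15.7.4). */
  XyzPoint(final DataInputStream in) throws IOException {
    this(in.readDouble(), in.readDouble(), in.readDouble());
  }

  /** Writes the three coordinates in the same order the reading constructor expects. */
  void write(final DataOutputStream out) throws IOException {
    out.writeDouble(x);
    out.writeDouble(y);
    out.writeDouble(z);
  }

  /** Serializes a point to a byte array and reads it back. */
  static XyzPoint roundTrip(final XyzPoint point) {
    try {
      final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (DataOutputStream out = new DataOutputStream(bytes)) {
        point.write(out);
      }
      try (DataInputStream in =
          new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return new XyzPoint(in);
      }
    } catch (IOException e) {
      // In-memory streams should not fail; rethrow unchecked to keep callers simple.
      throw new UncheckedIOException(e);
    }
  }
}
```

Because the doubles are written and read bit for bit, a point that goes through roundTrip comes back with identical coordinates, which is the property the equals check needs.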
[jira] [Commented] (LUCENE-8587) Self comparison bug in GeoComplexPolygon.equals method
[ https://issues.apache.org/jira/browse/LUCENE-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717020#comment-16717020 ] Karl Wright commented on LUCENE-8587: - Ok, you're right, this is more complex. We cannot do without the testpoint and the in/out of set boolean, even though moving these around might produce exactly the same polygon. On the other hand, blaming the serialization of the testpoint also seems odd since it's basically preserved from the constructor in whatever form was there. Perhaps serialization/deserialization of the geopoint needs to change. Let me examine that next.
[jira] [Commented] (LUCENE-8587) Self comparison bug in GeoComplexPolygon.equals method
[ https://issues.apache.org/jira/browse/LUCENE-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717004#comment-16717004 ] Karl Wright commented on LUCENE-8587: - {quote} Maybe we should build the point here using the equivalent [lat, lon] {quote} [~ivera] No, that makes no sense. Polygons are never constructed using (x,y,z) coordinates; they are always constructed using lat/lon points and a planet model. If the lat/lons are the same you won't get different x,y,z points, period. So there's something else being done wrong, and I think the problem is probably the random number generator construction of the testpoint. The testpoint should *not* be included in the equals computation for that reason. I will commit a fix.
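The self comparison reported in LUCENE-8587 (equals checking its own testPoint1 instead of other.testPoint1) is easy to reproduce in isolation. The sketch below is an assumption-laden illustration, not the real GeoComplexPolygon: PolygonSketch, edgeCount and buggyEquals are made-up names, and the test point is reduced to a single double to keep it short.

```java
import java.util.Objects;

/** Hypothetical shape carrying a test point; this is not the real GeoComplexPolygon. */
final class PolygonSketch {
  private final int edgeCount;
  private final double testPoint1;

  PolygonSketch(final int edgeCount, final double testPoint1) {
    this.edgeCount = edgeCount;
    this.testPoint1 = testPoint1;
  }

  /** Buggy variant: compares testPoint1 with itself, so that part of the check is always true. */
  boolean buggyEquals(final PolygonSketch other) {
    return edgeCount == other.edgeCount
        && Double.compare(testPoint1, testPoint1) == 0;
  }

  /** Corrected variant: compares this object's field with the other object's field. */
  @Override
  public boolean equals(final Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof PolygonSketch)) {
      return false;
    }
    final PolygonSketch other = (PolygonSketch) o;
    return edgeCount == other.edgeCount
        && Double.compare(testPoint1, other.testPoint1) == 0;
  }

  @Override
  public int hashCode() {
    return Objects.hash(edgeCount, testPoint1);
  }
}
```

With the buggy variant, two shapes that differ only in their test point compare as equal; the corrected equals tells them apart. Whether the test point belongs in equals at all is the separate question discussed in this thread.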
[jira] [Assigned] (LUCENE-8587) Self comparison bug in GeoComplexPolygon.equals method
[ https://issues.apache.org/jira/browse/LUCENE-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned LUCENE-8587: --- Assignee: Karl Wright
Re: Welcome Tim Allison as a Lucene/Solr committer
Welcome! Karl On Mon, Nov 5, 2018 at 1:39 PM Christine Poerschke (BLOOMBERG/ LONDON) < cpoersc...@bloomberg.net> wrote: > Welcome Tim! > > From: dev@lucene.apache.org At: 11/02/18 16:20:52 > To: dev@lucene.apache.org > Subject: Welcome Tim Allison as a Lucene/Solr committer > > Hi all, > > > Please join me in welcoming Tim Allison as the latest Lucene/Solr committer! > > Congratulations and Welcome, Tim! > > It's traditional for you to introduce yourself with a brief bio. > > Erick > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > >
Re: Welcome Gus Heck as Lucene/Solr committer
Welcome!! Karl On Thu, Nov 1, 2018 at 9:53 PM Koji Sekiguchi wrote: > Welcome Gus! > > Koji > > On 2018/11/01 21:22, David Smiley wrote: > > Hi all, > > > > Please join me in welcoming Gus Heck as the latest Lucene/Solr committer! > > > > Congratulations and Welcome, Gus! > > > > Gus, it's traditional for you to introduce yourself with a brief bio. > > > > ~ David > > -- > > Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker > > LinkedIn: http://linkedin.com/in/davidwsmiley | Book: > http://www.solrenterprisesearchserver.com > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > >
[jira] [Commented] (LUCENE-8540) Geo3d quantization test failure for MAX/MIN encoding values
[ https://issues.apache.org/jira/browse/LUCENE-8540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670641#comment-16670641 ] Karl Wright commented on LUCENE-8540: - [~ivera] Looks reasonable as far as I can tell. The question is whether the decode scaling factor is 'correct' but I think changing that will cause people to need to reindex, so this is a better fix. > Geo3d quantization test failure for MAX/MIN encoding values > --- > > Key: LUCENE-8540 > URL: https://issues.apache.org/jira/browse/LUCENE-8540 > Project: Lucene - Core > Issue Type: Bug > Components: modules/spatial3d >Reporter: Ignacio Vera >Assignee: Ignacio Vera >Priority: Major > Attachments: LUCENE-8540.patch > > > Here is a reproducible error: > {code:java} > 08:45:21[junit4] Suite: org.apache.lucene.spatial3d.TestGeo3DPoint > 08:45:21[junit4] IGNOR/A 0.01s J1 | TestGeo3DPoint.testRandomBig > 08:45:21[junit4]> Assumption #1: 'nightly' test group is disabled > (@Nightly()) > 08:45:21[junit4] 2> NOTE: reproduce with: ant test > -Dtestcase=TestGeo3DPoint -Dtests.method=testQuantization > -Dtests.seed=4CB20CF248F6211 -Dtests.slow=true -Dtests.badapples=true > -Dtests.locale=ga-IE -Dtests.timezone=America/Bogota -Dtests.asserts=true > -Dtests.file.encoding=US-ASCII > 08:45:21[junit4] ERROR 0.20s J1 | TestGeo3DPoint.testQuantization <<< > 08:45:21[junit4]> Throwable #1: java.lang.IllegalArgumentException: > value=-1.0011188543037526 is out-of-bounds (less than than WGS84's > -planetMax=-1.0011188539924791) > 08:45:21[junit4]> at > __randomizedtesting.SeedInfo.seed([4CB20CF248F6211:32220FD9326E7F33]:0) > 08:45:21[junit4]> at > org.apache.lucene.spatial3d.Geo3DUtil.encodeValue(Geo3DUtil.java:56) > 08:45:21[junit4]> at > org.apache.lucene.spatial3d.TestGeo3DPoint.testQuantization(TestGeo3DPoint.java:1228) > 08:45:21[junit4]> at java.lang.Thread.run(Thread.java:748) > 08:45:21[junit4] 2> NOTE: test params are: codec=Asserting(Lucene70): > 
{id=PostingsFormat(name=LuceneVarGapDocFreqInterval)}, > docValues:{id=DocValuesFormat(name=Asserting), > point=DocValuesFormat(name=Lucene70)}, maxPointsInLeafNode=659, > maxMBSortInHeap=6.225981846119071, sim=RandomSimilarity(queryNorm=false): {}, > locale=ga-IE, timezone=America/Bogota > 08:45:21[junit4] 2> NOTE: Linux 2.6.32-754.6.3.el6.x86_64 amd64/Oracle > Corporation 1.8.0_181 > (64-bit)/cpus=16,threads=1,free=466116320,total=536346624 > 08:45:21[junit4] 2> NOTE: All tests run in this JVM: [GeoPointTest, > RandomGeoPolygonTest, TestGeo3DPoint] > 08:45:21[junit4] Completed [18/18 (1!)] on J1 in 19.83s, 14 tests, 1 > error, 1 skipped <<< FAILURES!{code} > > It seems this test will fail if encoding = Geo3DUtil.MIN_ENCODED_VALUE or > encoding = Geo3DUtil.MAX_ENCODED_VALUE. > It is related with https://issues.apache.org/jira/browse/LUCENE-7327 > > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
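To make the failure mode at the ends of the encoded range concrete, here is a small self-contained sketch. It rests on assumptions: QuantizationSketch, STEP, encodeStrict, decodeFloor and encodeClamped are hypothetical names, and the step size is chosen for illustration only; it is not claimed to be the scaling factor Geo3DUtil uses. The only number taken from the report is the planetMax value printed in the exception message.

```java
/** Hypothetical quantizer illustrating out-of-bounds values at the ends of the int range. */
final class QuantizationSketch {
  /** Largest accepted magnitude; the number is the planetMax value shown in the log above. */
  static final double PLANET_MAX = 1.0011188539924791;

  /** Width of one quantization cell; an assumption of this sketch, not Lucene's factor. */
  static final double STEP = Math.nextUp(PLANET_MAX / Integer.MAX_VALUE);

  /** Strict encoder: rejects anything outside [-PLANET_MAX, PLANET_MAX]. */
  static int encodeStrict(final double value) {
    if (value < -PLANET_MAX || value > PLANET_MAX) {
      throw new IllegalArgumentException("value=" + value + " is out-of-bounds");
    }
    return (int) Math.floor(value / STEP);
  }

  /** Lower edge of the cell that an encoded value stands for. */
  static double decodeFloor(final int encoded) {
    return encoded * STEP;
  }

  /** Tolerant encoder: clamps to the accepted range before encoding. */
  static int encodeClamped(final double value) {
    final double clamped = Math.max(-PLANET_MAX, Math.min(PLANET_MAX, value));
    return encodeStrict(clamped);
  }
}
```

In this sketch, decodeFloor(Integer.MIN_VALUE) lands slightly below -PLANET_MAX because the cell width is a little larger than PLANET_MAX / 2^31, so feeding it back into encodeStrict throws, while encodeClamped accepts it. A fix along these lines leaves the step size alone, so previously encoded values keep their meaning.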
[jira] [Commented] (LUCENE-8540) Geo3d quantization test failure for MAX/MIN encoding values
[ https://issues.apache.org/jira/browse/LUCENE-8540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660515#comment-16660515 ] Karl Wright commented on LUCENE-8540: - Hi [~ivera], can you have a look at this? I'm quite busy today unfortunately.
[jira] [Assigned] (LUCENE-8540) Geo3d quantization test failure for MAX/MIN encoding values
[ https://issues.apache.org/jira/browse/LUCENE-8540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned LUCENE-8540: --- Assignee: Ignacio Vera
[jira] [Commented] (LUCENE-8522) Spatial: Polygon touching the negative boundaries of WGS84 fails on Solr
[ https://issues.apache.org/jira/browse/LUCENE-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641762#comment-16641762 ] Karl Wright commented on LUCENE-8522: - [~ivera], looks good to me. > Spatial: Polygon touching the negative boundaries of WGS84 fails on Solr > > > Key: LUCENE-8522 > URL: https://issues.apache.org/jira/browse/LUCENE-8522 > Project: Lucene - Core > Issue Type: Bug > Components: modules/spatial3d >Affects Versions: 7.4, 7.5, master (8.0) >Reporter: Ema Panz >Assignee: Ignacio Vera >Priority: Critical > Attachments: LUCENE-8522.patch > > > When using the WGS84 coordinates system and querying with a polygon touching > one of the "negative" borders, Solr throws a "NullPointerException" error. > The query is performed with the "intersect" function over a GeoJson polygon > specified with the coordinates: > { "coordinates":[[[-180,90],[-180,-90],[180,-90],[180,90],[-180,90]]] } > > The queried field has been defined as: > {code:java} > class="solr.SpatialRecursivePrefixTreeFieldType" >spatialContextFactory="Geo3D" >geo="true" >planetModel="WGS84" >format="GeoJSON" > />{code} > > {code:java} > java.lang.NullPointerException > at > org.apache.lucene.spatial.spatial4j.Geo3dShape.getBoundingBox(Geo3dShape.java:114) > at > org.apache.lucene.spatial.query.SpatialArgs.calcDistanceFromErrPct(SpatialArgs.java:63) > at > org.apache.lucene.spatial.query.SpatialArgs.resolveDistErr(SpatialArgs.java:84) > at > org.apache.lucene.spatial.prefix.RecursivePrefixTreeStrategy.makeQuery(RecursivePrefixTreeStrategy.java:182) > at > org.apache.solr.schema.AbstractSpatialFieldType.getQueryFromSpatialArgs(AbstractSpatialFieldType.java:368) > at > org.apache.solr.schema.AbstractSpatialFieldType.getFieldQuery(AbstractSpatialFieldType.java:340) > at > org.apache.solr.search.FieldQParserPlugin$1.parse(FieldQParserPlugin.java:45) > at org.apache.solr.search.QParser.getQuery(QParser.java:169) > at > 
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:207) > at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:272) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2539) > at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:709) > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:515) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:377) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:323) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634) > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533) > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146) > at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) > at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) > at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257) > at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595) > at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255) > at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253) > at > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203) > at > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473) > at > org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564) > at > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201) > at > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155) > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144) > at > 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219) > at > org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126) > at > org.eclipse.jetty.server.hand
[jira] [Commented] (LUCENE-8522) Spatial: Polygon touching the negative boundaries of WGS84 fails on Solr
[ https://issues.apache.org/jira/browse/LUCENE-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639708#comment-16639708 ] Karl Wright commented on LUCENE-8522: - [~ivera] It's actually in the GeoPolygonFactory contract that null is returned for polygons that cannot be constructed:

{code}
/** Create a GeoPolygon using the specified points and holes, using order to determine
 * siding of the polygon. Much like ESRI, this method uses clockwise to indicate the space
 * on the same side of the shape as being inside, and counter-clockwise to indicate the
 * space on the opposite side as being inside.
 * @param description describes the polygon and its associated holes. If points go
 *  clockwise from a given pole, then that pole should be within the polygon. If points go
 *  counter-clockwise, then that pole should be outside the polygon.
 * @param leniencyValue is the maximum distance (in units) that a point can be from the plane and still be considered as
 *  belonging to the plane. Any value greater than zero may cause some of the provided points that are in fact outside
 *  the strict definition of co-planarity, but are within this distance, to be discarded for the purposes of creating a
 *  "safe" polygon.
 * @return a GeoPolygon corresponding to what was specified, or null if a valid polygon cannot be generated
 *  from this input.
 */
{code}

So I think we ought to leave this alone and change the spatial4j wrapper to throw the exception.
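The approach suggested for LUCENE-8522 (keep the factory's documented null return, and have the wrapper turn it into an exception) could look roughly like the sketch below. Everything in it is hypothetical: NullContractSketch, makeShape and makeShapeOrThrow are illustrative names, a String stands in for the shape, and the "fewer than three points" rule is just a placeholder for whatever makes a polygon impossible to build.

```java
/** Hypothetical factory and wrapper; not GeoPolygonFactory or the spatial4j wrapper code. */
final class NullContractSketch {
  /** Factory with a documented contract: returns null when no valid shape can be built. */
  static String makeShape(final double[][] points) {
    if (points == null || points.length < 3) {
      return null; // placeholder rule standing in for "a valid polygon cannot be generated"
    }
    return "polygon with " + points.length + " points";
  }

  /** Wrapper that checks the contract and reports the problem to the caller explicitly. */
  static String makeShapeOrThrow(final double[][] points) {
    final String shape = makeShape(points);
    if (shape == null) {
      throw new IllegalArgumentException(
          "Invalid polygon: a valid shape cannot be generated from this input");
    }
    return shape;
  }
}
```

The design point is that the null never escapes the wrapper, so callers further up the stack see an IllegalArgumentException with a readable message rather than a NullPointerException from a later dereference.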
[jira] [Commented] (LUCENE-8522) Spatial: Polygon touching the negative boundaries of WGS84 fails on Solr
[ https://issues.apache.org/jira/browse/LUCENE-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639702#comment-16639702 ] Karl Wright commented on LUCENE-8522: - [~ivera] I agree that the factory should never return null; it should throw an IllegalArgumentException instead. This would have to be a fix in the factory. But that sounds like a trivial change, no?
[jira] [Commented] (LUCENE-8522) Spatial: Polygon touching the negative boundaries of WGS84 fails on Solr
[ https://issues.apache.org/jira/browse/LUCENE-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638458#comment-16638458 ]

Karl Wright commented on LUCENE-8522:

[~ivera], I think this is your code?

> Spatial: Polygon touching the negative boundaries of WGS84 fails on Solr
>
> Key: LUCENE-8522
> URL: https://issues.apache.org/jira/browse/LUCENE-8522
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/spatial3d
> Affects Versions: 7.4, 7.5
> Reporter: Ema Panz
> Priority: Critical
>
> When using the WGS84 coordinates system and querying with a polygon touching
> one of the "negative" borders, Solr throws a "NullPointerException" error.
> The query is performed with the "intersect" function over a GeoJson polygon
> specified with the coordinates:
> { "coordinates":[[[-180,90],[-180,-90],[180,-90],[180,90],[-180,90]]] }
>
> The queried field has been defined as:
> {code:java}
> class="solr.SpatialRecursivePrefixTreeFieldType"
>        spatialContextFactory="Geo3D"
>        geo="true"
>        planetModel="WGS84"
>        format="GeoJSON"
> />{code}
>
> {code:java}
> java.lang.NullPointerException
> at org.apache.lucene.spatial.spatial4j.Geo3dShape.getBoundingBox(Geo3dShape.java:114)
> at org.apache.lucene.spatial.query.SpatialArgs.calcDistanceFromErrPct(SpatialArgs.java:63)
> at org.apache.lucene.spatial.query.SpatialArgs.resolveDistErr(SpatialArgs.java:84)
> at org.apache.lucene.spatial.prefix.RecursivePrefixTreeStrategy.makeQuery(RecursivePrefixTreeStrategy.java:182)
> at org.apache.solr.schema.AbstractSpatialFieldType.getQueryFromSpatialArgs(AbstractSpatialFieldType.java:368)
> at org.apache.solr.schema.AbstractSpatialFieldType.getFieldQuery(AbstractSpatialFieldType.java:340)
> at org.apache.solr.search.FieldQParserPlugin$1.parse(FieldQParserPlugin.java:45)
> at org.apache.solr.search.QParser.getQuery(QParser.java:169)
> at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:207)
> at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:272)
> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2539)
> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:709)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:515)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:377)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:323)
> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
> at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
> at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
> at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
> at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
> at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
> at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
> at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
> at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
> at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
> at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> at org.eclipse.jetty.ser
[jira] [Comment Edited] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633256#comment-16633256 ]

Karl Wright edited comment on SOLR-12798 at 9/30/18 6:02 AM:

[~elyograg]
{quote}
I would suggest that you don't do this. At all. Tika is prone to OOM and JVM crashes, as Julien Massiera already noted.
{quote}
It's not a very good citizen running inside ManifoldCF either. We have the ability to use the external service version but really that just offshores the problem. But I agree it's better to keep user-facing services alive if one can.

For backwards compatibility reasons, we will need to continue to support this mode of operation, but we'll recommend against it, and consider changing our defaults accordingly as well.

FWIW, we've been steadily pushing tickets into the Tika queue and issues are getting addressed. That's really the best long-term solution.

was (Author: kwri...@metacarta.com):
[~elyograg]
{quote}
I would suggest that you don't do this. At all. Tika is prone to OOM and JVM crashes, as Julien Massiera already noted.
{quote}
It's not a very good citizen running inside ManifoldCF either. We have the ability to use the external service version but really that just offshores the problem. But I agree it's better to keep user-facing services alive if one can.

For backwards compatibility reasons, we will need to continue to support this mode of operation, but we'll recommend against it, and change our defaults accordingly as well.

FWIW, we've been steadily pushing tickets into the Tika queue and issues are getting addressed. That's really the best long-term solution.

> Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
>
> Key: SOLR-12798
> URL: https://issues.apache.org/jira/browse/SOLR-12798
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrJ
> Affects Versions: 7.4
> Reporter: Karl Wright
> Assignee: Karl Wright
> Priority: Major
> Attachments: HOT Balloon Trip_Ultra HD.jpg, SOLR-12798-approach.patch,
> SOLR-12798-reproducer.patch, SOLR-12798-workaround.patch, SOLR-12798.patch,
> no params in url.png, solr-update-request.txt
>
> Project ManifoldCF uses SolrJ to post documents to Solr. When upgrading from
> SolrJ 7.0.x to SolrJ 7.4, we encountered significant structural changes to
> SolrJ's HttpSolrClient class that seemingly disable any use of multipart
> post. This is critical because ManifoldCF's documents often contain metadata
> in excess of 4K that therefore cannot be stuffed into a URL.
>
> The changes in question seem to have been performed by Noble Paul on
> 10/31/2017, with the introduction of the RequestWriter mechanism. Basically,
> if a request has a RequestWriter, it is used exclusively to write the
> request, and that overrides the stream mechanism completely. I haven't
> chased it back to a specific ticket.
>
> ManifoldCF's usage of SolrJ involves the creation of
> ContentStreamUpdateRequests for all posts meant for Solr Cell, and the
> creation of UpdateRequests for posts not meant for Solr Cell (as well as for
> delete and commit requests). For our release cycle that is taking place
> right now, we're shipping a modified version of HttpSolrClient that ignores
> the RequestWriter when dealing with ContentStreamUpdateRequests. We
> apparently cannot use multipart for all requests because, when we do, we
> get "pfountz Should not get here!" errors on the Solr side, which
> generate HTTP error code 500 responses. That should not happen either, in my
> opinion.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633256#comment-16633256 ] Karl Wright commented on SOLR-12798: [~elyograg] {quote} I would suggest that you don't do this. At all. Tika is prone to OOM and JVM crashes, as Julien Massiera already noted. {quote} It's not a very good citizen running inside ManifoldCF either. We have the ability to use the external service version but really that just offshores the problem. But I agree it's better to keep user-facing services alive if one can. For backwards compatibility reasons, we will need to continue to support this mode of operation, but we'll recommend against it, and change our defaults accordingly as well. FWIW, we've been steadily pushing tickets into the Tika queue and issues are getting addressed. That's really the best long-term solution.
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633252#comment-16633252 ] Karl Wright commented on SOLR-12798: [~mkhludnev] Ugly hack has been voted on and shipped. Hopefully by next round (December) there's a better way though.
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631149#comment-16631149 ] Karl Wright commented on SOLR-12798: [~janhoy]: {quote} That would be for case 1) where you don't do Tika stuff on the MCF side but want Solr to handle the binary stream. In this case there should be no problem with huge metadata request params. And I agree that SolrJ should support this case (ContentStreamUpdateRequest?). {quote} Ok. At the moment that sort of request seems to be transmitted with standard POST with metadata stuffed into the URL. So a fix is needed for that. {code} I got confused by your other use case where you parse the file with Tika on the MCF side and still sent the text to /extract {code} While Julien has a custom Solr handler, that's not what we typically do, and we recommend that already-Tika-extracted content and metadata be sent to the /update handler. In that case, we build a SolrInputDocument from the content stream, and add it into an UpdateRequest. This mode of usage also seems to use standard POST or even PUT, and it puts all the metadata parameters on the URL. This is transmitted to the /update handler. Do you want to support the case where the metadata parameters are sizable enough that the URL exceeds 8192 bytes?
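To make the URL-size concern concrete, here is a minimal sketch in plain Java (no SolrJ involved) that folds metadata name/value pairs into a query string and checks the result against the 8192-byte figure mentioned in the thread. The class name, the endpoint URL, and the field names are hypothetical, and the actual limit depends on how the server and any proxies are configured.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class UrlLengthCheck {
    // Figure taken from the thread; real limits depend on the server and any proxies.
    static final int MAX_URL_BYTES = 8192;

    // Appends every metadata pair to the base URL as an encoded query parameter.
    static String toQueryUrl(String baseUrl, Map<String, String> metadata) {
        StringBuilder sb = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> e : metadata.entrySet()) {
            sb.append(separator)
              .append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
            separator = '&';
        }
        return sb.toString();
    }

    // Assumes an ASCII base URL, so character count equals byte count after encoding.
    static boolean fitsInUrl(String baseUrl, Map<String, String> metadata) {
        return toQueryUrl(baseUrl, metadata).length() <= MAX_URL_BYTES;
    }

    public static void main(String[] args) {
        String base = "http://localhost:8983/solr/example/update"; // hypothetical endpoint
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("title", "hello world");
        metadata.put("summary", "x".repeat(9000)); // hypothetical oversized field
        System.out.println("URL length: " + toQueryUrl(base, metadata).length());
        System.out.println("fits: " + fitsInUrl(base, metadata));
    }
}
```

A check like this only tells a client when it has to switch transports; it does not change how any particular client library builds its requests.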
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631084#comment-16631084 ]

Karl Wright commented on SOLR-12798:

[~janhoy], if you didn't mean that the metadata and content should be sent in the content body, then I'm completely missing what your suggestion is.

{quote}
My cURL examples were just to discuss what "metadata" might mean in this context.
{quote}

Repositories that are crawled by ManifoldCF have documents that are represented as follows:

- A long content stream, binary
- N pairs of name/value data, called metadata, which is fielded data associated with the document

If the metadata is extracted in a ManifoldCF pipeline from the content stream, it's done via Tika, from a binary stream, which changes the binary content stream to a simple text stream, and also supplies more metadata generated as a result of the extraction.

In other words, your JSON example is not like anything we do at all at this time. If you want this translated into CURL, you can do it in one of two ways:

(1) Put the metadata onto the URL as & parameters, e.g. name1=value1&name2=value2 etc, or
(2) Send the metadata as sections in a multipart post. This too can be set up in CURL if you want me to propose an example.

Each section in a multipart post has a name, and you can thus transmit a section for every metadata name/value pair, as well as one for the content part (which has its own name, that is in fact used by SolrCell for metadata of its own.)

Hope this helps.
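As a rough illustration of option (2), the sketch below assembles a multipart/form-data body by hand, with one section per metadata name/value pair and a final section for the binary content. It uses only the Java standard library; the class name, boundary, field names, and content name are hypothetical, and it assumes section names are plain ASCII without quotes. It is not how SolrJ builds its requests.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class MultipartSketch {
    private static void write(ByteArrayOutputStream out, String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        out.write(b, 0, b.length);
    }

    // Builds a multipart/form-data body: one text section per metadata pair,
    // then one binary section carrying the document content.
    static byte[] build(String boundary, Map<String, String> metadata,
                        String contentName, byte[] content) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map.Entry<String, String> e : metadata.entrySet()) {
            write(out, "--" + boundary + "\r\n");
            write(out, "Content-Disposition: form-data; name=\"" + e.getKey() + "\"\r\n");
            write(out, "Content-Type: text/plain; charset=UTF-8\r\n\r\n");
            write(out, e.getValue());
            write(out, "\r\n");
        }
        write(out, "--" + boundary + "\r\n");
        write(out, "Content-Disposition: form-data; name=\"" + contentName
                + "\"; filename=\"" + contentName + "\"\r\n");
        write(out, "Content-Type: application/octet-stream\r\n\r\n");
        out.write(content, 0, content.length);
        // The closing delimiter ends the whole body.
        write(out, "\r\n--" + boundary + "--\r\n");
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("author", "someone");       // hypothetical metadata fields
        metadata.put("title", "a long title");
        byte[] body = build("example-boundary", metadata, "doc.bin", new byte[] {1, 2, 3});
        System.out.println(new String(body, StandardCharsets.ISO_8859_1));
    }
}
```

Because every pair travels in the request body, the size of the metadata no longer affects the length of the URL.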
[jira] [Comment Edited] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631059#comment-16631059 ] Karl Wright edited comment on SOLR-12798 at 9/27/18 9:31 PM: [~janhoy], so your suggestion is to use JSON format for the body, and put the metadata into that. How do you suggest we handle binary data that is meant for SolrCell? Encoding the binary in a JSON document is possible but in practice this is quite verbose, yielding 3 or 4 bytes to one. Is that nevertheless your official suggestion? Also, how do you force SolrJ to transmit the right mime type to Solr, as well as the document name field (which SolrCell cares about), if you use JSON encoding? I assume that you have to signal this somehow? The code seems to get the mime type from the Request, but it's not set anywhere by the user, so I presume this is either set by default or there is some way to set it? was (Author: kwri...@metacarta.com): [~janhoy], so your suggestion is to use JSON format for the body, and put the metadata into that. How do you suggest we handle binary data that is meant for SolrCell? Encoding the binary in a JSON document is possible but in practice this is quite verbose, yielding 3 or 4 bytes to one. Is that nevertheless your official suggestion?
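The size overhead of carrying binary data inside JSON depends on the encoding chosen, which the comment does not specify. As a hedged illustration, the sketch below measures two possibilities with the Java standard library: Base64 text, and a string in which every byte is written as a six-character JSON unicode escape. The class and method names are hypothetical.

```java
import java.util.Base64;

final class JsonBinaryOverhead {
    // Length of the Base64 text for the given bytes (4 characters per 3 input bytes).
    static int base64Length(byte[] data) {
        return Base64.getEncoder().encodeToString(data).length();
    }

    // Length when every byte becomes a six-character JSON unicode escape sequence.
    static int escapedLength(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("\\u%04x", b & 0xff));
        }
        return sb.length();
    }

    public static void main(String[] args) {
        byte[] data = new byte[300];
        System.out.println("raw bytes:       " + data.length);
        System.out.println("Base64 chars:    " + base64Length(data));
        System.out.println("escaped chars:   " + escapedLength(data));
    }
}
```

Either way the payload grows relative to a raw binary section in a multipart post, which is the trade-off being questioned here.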
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631059#comment-16631059 ] Karl Wright commented on SOLR-12798: [~janhoy], so your suggestion is to use JSON format for the body, and put the metadata into that. How do you suggest we handle binary data that is meant for SolrCell? Encoding the binary in a JSON document is possible but in practice this is quite verbose, yielding 3 or 4 bytes to one. Is that nevertheless your official suggestion?
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631057#comment-16631057 ] Karl Wright commented on SOLR-12798: [~mkhludnev], your walkthrough in the code is fine but (a) when we use ContentStreamUpdateHandler in the manner you describe to the update/extract handler, we still wind up going through the contentWriter clause above where you stop, and (b) when we use UpdateHandler in the manner you describe we also go through that same path. In fact I could find no way to send the content through any other path with the code as it exists in master right now, because in our usage there's always a contentWriter and the check for its presence excludes all else that happens after that. So I don't understand where the disconnect is. Perhaps if you attach the exact code you are testing we can resolve this.
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630851#comment-16630851 ]

Karl Wright commented on SOLR-12798:

Please examine the following code from master HttpSolrClient.java:

{code}
if (contentWriter != null) {
  String fullQueryUrl = url + wparams.toQueryString();
  HttpEntityEnclosingRequestBase postOrPut = SolrRequest.METHOD.POST == request.getMethod() ?
      new HttpPost(fullQueryUrl) : new HttpPut(fullQueryUrl);
  postOrPut.addHeader("Content-Type", contentWriter.getContentType());
  postOrPut.setEntity(new BasicHttpEntity() {
    @Override
    public boolean isStreaming() {
      return true;
    }

    @Override
    public void writeTo(OutputStream outstream) throws IOException {
      contentWriter.write(outstream);
    }
  });
  return postOrPut;
} else if (streams == null || isMultipart) {
{code}

The request is formed by taking all the parameters in wparams (which include the metadata fields AFAICT) and putting them into the URL:

{code}
HttpEntityEnclosingRequestBase postOrPut = SolrRequest.METHOD.POST == request.getMethod() ?
    new HttpPost(fullQueryUrl) : new HttpPut(fullQueryUrl);
{code}

There is no other way in the SolrJ request handling code for PUT and POST requests to transmit metadata to Solr. Indeed, right now, both documents added to an UpdateRequest, as well as documents that are specified via ContentStreamUpdateRequest, go by this route.

We did verify that using the 7.5.0 version of SolrJ and completely removing all ManifoldCF custom code led to documents that would exceed the maximum URL length if their metadata was long enough.
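The ordering problem described above can be shown with a simplified model of the two conditions visible in the excerpt. This is an illustrative sketch only, not SolrJ code: the class, the enum, and the branch labels are hypothetical, and what the later branches actually do is not shown in the excerpt, so they are treated as opaque here.

```java
final class TransportChoice {
    enum Branch {
        CONTENT_WRITER_WITH_PARAMS_IN_URL, // first branch in the excerpt
        STREAMS_NULL_OR_MULTIPART,         // second condition in the excerpt
        REMAINING                          // anything after that (not shown in the excerpt)
    }

    // Mirrors the order of the checks: the contentWriter test comes first,
    // so the multipart flag is never consulted when a writer is present.
    static Branch choose(boolean hasContentWriter, boolean hasStreams, boolean isMultipart) {
        if (hasContentWriter) {
            return Branch.CONTENT_WRITER_WITH_PARAMS_IN_URL;
        } else if (!hasStreams || isMultipart) {
            return Branch.STREAMS_NULL_OR_MULTIPART;
        }
        return Branch.REMAINING;
    }

    public static void main(String[] args) {
        // Even with multipart requested, a present writer wins.
        System.out.println(choose(true, true, true));
        System.out.println(choose(false, true, true));
        System.out.println(choose(false, true, false));
    }
}
```

In this model, requesting multipart has no effect whenever a content writer exists, which matches the behaviour the comment reports.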
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630531#comment-16630531 ]

Karl Wright commented on SOLR-12798:

{quote}
This looks to me like a plain Solr document post to /update handler, in whatever format you'd like? If you can take advantage of Noble Paul's enhancements to stream the content this can still be a plain document not needing multipart, and no need sending data in http params?
{quote}

The streaming part is great. But if you look at the current master implementation of HttpSolrClient, you will note that all parameters and metadata are folded into the URL for the ContentWriter transmission mechanism. This fails for us because the URL size can easily exceed 8192 bytes. That is why we need the multipart post handling even for UpdateRequest/SolrInputDocument requests.
[jira] [Comment Edited] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630203#comment-16630203 ]

Karl Wright edited comment on SOLR-12798 at 9/27/18 11:04 AM:

[~janhoy], the example we provided is using type (1) output configuration, as Julien noted. Do you want a type (2) example? It will not change the need for multipart post.

was (Author: kwri...@metacarta.com):
[~janhoy], the example we provided is using type (1), as Julien noted. Do you want a type (2) example?
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630203#comment-16630203 ]

Karl Wright commented on SOLR-12798:

[~janhoy], the example we provided is using type (1), as Julien noted. Do you want a type (2) example?
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630133#comment-16630133 ]

Karl Wright commented on SOLR-12798:

Hi [~janhoy], typically for case (2) the /update handler is used, not the /update/extract handler.
[jira] [Comment Edited] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629860#comment-16629860 ]

Karl Wright edited comment on SOLR-12798 at 9/27/18 7:35 AM:

I should also note that other prime examples of this issue *cannot* be added to this ticket for security reasons. Most of ManifoldCF's clients are integrators; they don't generally have permission to include company content without obtaining specific company permission. Luckily FranceLabs has a few examples hanging around or it would be a real challenge to put together a real-world example for you guys, since I don't have licensed and operating copies of the worst offending proprietary repositories available to me anymore.

was (Author: kwri...@metacarta.com):
I should also note that other prime examples of this issue *cannot* be added to this ticket for security reasons. Most of ManifoldCF's clients are integrators; they don't generally have permission to include company content without obtaining specific company permission. Luckily FranceLabs has a few examples hanging around or it would be a real challenge to put together a real-world example for you guys.
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629864#comment-16629864 ]

Karl Wright commented on SOLR-12798:

I've attached a patch, not meant to be applied, which shows the general approach I'd like to explore for a fix.

The biggest problem I've had in making this stuff work is figuring out when multipart ought to be used in the HttpSolrClient code. I therefore propose that there be an explicit METHOD type created for multipart post, and that HttpSolrClient pay attention to that when assembling its payload. The payload would be assembled solely using the ContentWriter mechanism, but the metadata would go into multipart form fields rather than the URL.

The patch does not contain the modifications to HttpSolrClient yet; I just wanted to initiate the discussion. Does anyone see a problem with this?
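The idea of carrying metadata in multipart form fields rather than in the URL can be sketched without any HTTP library. The following is a hand-rolled illustration of a multipart/form-data body, not the attached patch and not SolrJ's or HttpClient's API: `MultipartSketch`, `buildBody`, the boundary string, the `myfile` part name and the `literal.*` field names are all hypothetical, and field names are assumed to contain no quotes or line breaks.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class MultipartSketch {

    // Append text to the buffer as UTF-8 bytes.
    static void writeText(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }

    // Build a multipart/form-data body: one text part per metadata field,
    // followed by a single binary part carrying the document content.
    static byte[] buildBody(String boundary, Map<String, String> fields,
                            String fileName, byte[] content) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            writeText(out, "--" + boundary + "\r\n");
            writeText(out, "Content-Disposition: form-data; name=\"" + e.getKey() + "\"\r\n");
            writeText(out, "Content-Type: text/plain; charset=UTF-8\r\n\r\n");
            writeText(out, e.getValue() + "\r\n");
        }
        writeText(out, "--" + boundary + "\r\n");
        writeText(out, "Content-Disposition: form-data; name=\"myfile\"; filename=\""
                + fileName + "\"\r\n");
        writeText(out, "Content-Type: application/octet-stream\r\n\r\n");
        out.write(content, 0, content.length);
        // Closing delimiter marks the end of the multipart body.
        writeText(out, "\r\n--" + boundary + "--\r\n");
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("literal.id", "doc-1");
        StringBuilder longValue = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            longValue.append("+0.5,");
        }
        metadata.put("literal.tone_curve", longValue.toString());

        byte[] body = buildBody("sketch-boundary", metadata, "example.bin",
                "document bytes".getBytes(StandardCharsets.UTF_8));
        // The URL stays short; the metadata size only affects the body length.
        System.out.println("body bytes: " + body.length);
    }
}
```

With this layout the request URL carries no metadata at all, so its length no longer grows with the size of the metadata values.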
[jira] [Updated] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated SOLR-12798:

Attachment: SOLR-12798-approach.patch
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629860#comment-16629860 ]

Karl Wright commented on SOLR-12798:

I should also note that other prime examples of this issue *cannot* be added to this ticket for security reasons. Most of ManifoldCF's clients are integrators; they don't generally have permission to include company content without obtaining specific company permission. Luckily FranceLabs has a few examples hanging around or it would be a real challenge to put together a real-world example for you guys.
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629853#comment-16629853 ]

Karl Wright commented on SOLR-12798:

{quote}
Specifically the example that generates meaningful metadata and body (multipart) both of which are ending-up used in Solr.
{quote}

The data has now been provided, and the Solr [INFO] log line for it as well. Are you still asking for the multipart request that *should* be generated by SolrJ for that request? As I've stated, we have had to modify chunks of SolrJ in order to generate that multipart request; with some work we can probably capture it in an HttpClient wire log, but it *is* some work.
Re: Solr examples with long metadata needed
8944839,+0.8965591,+0.8986496,+0.9007401,+0.9028305,+0.9049363,+0.9070268,+0.9091325,+0.9112383,+0.913344,+0.915465,+0.9175708,+0.9196918,+0.9218128,+0.9239338,+0.9260548,+0.9281758,+0.930312,+0.9324483,+0.9345846,+0.9367208,+0.9388571,+0.9410086,+0.9431601,+0.9453117,+0.9474632,+0.9496147,+0.9517815,+0.953933,+0.9560998,+0.9582666,+0.9604334,+0.9626154,+0.9647822,+0.9669642,+0.9691463,+0.9713283,+0.9735256,+0.9757076,+0.9779049,+0.9801022,+0.9822995,+0.9844968,+0.9867094,+0.988922,+0.9911345,+0.9933471,+0.9955596,+0.9977722,+1.0=HOT+Balloon+Trip_Ultra+HD.jpg=2017-09-07T09:34:53.000Z_colorant=(0.1431,+0.0606,+0.7141)_model_description=IEC+61966-2.1+Default+RGB+colour+space+-+sRGB=2017-09-07T09:34:53.000Z_resolution=300+dots=32_conditions=view+(0x76696577):+36+bytes_description=sRGB+IEC61966-2.1_image_width=3840+pixels=OCR_conditions_description=Reference+Viewing+Condition+in+IEC61966-2.1_height=2160+pixels}{add=[file:/localhost/OCR/HOT%20Balloon%20Trip_Ultra%20HD.jpg (1612689210913849344)]}
>
> Julien
>
> On 26/09/2018 17:09, Karl Wright wrote:
> > Hi ManifoldCF Community,
> >
> > I need one or two concrete examples of solr [INFO] log messages that
> > include very long metadata (>8192). This is apparently critical for
> > getting the SolrJ team to be able to understand ManifoldCF's usage of
> > solr. If you have such examples around, please be sure that the data
> > contained in the info URL is not confidential in any way.
> >
> > (Julien, you were the last person to run into this -- hopefully that
> > image is still around and the metadata can be shared?)
> >
> > Thanks in advance,
> > Karl
>
> --
> Julien MASSIERA
> Product Development Director
> France Labs – The Search Experts
> Meet us at the Enterprise Search & Discovery Summit in Washington DC
> www.francelabs.com
Solr examples with long metadata needed
Hi ManifoldCF Community,

I need one or two concrete examples of Solr [INFO] log messages that include very long metadata (>8192). This is apparently critical for getting the SolrJ team to be able to understand ManifoldCF's usage of Solr. If you have such examples around, please be sure that the data contained in the info URL is not confidential in any way.

(Julien, you were the last person to run into this -- hopefully that image is still around and the metadata can be shared?)

Thanks in advance,
Karl
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628879#comment-16628879 ]

Karl Wright commented on SOLR-12798:

{quote}
The data may be generic, but it has to be fed into Solr in one of the accepted parameters.
{quote}

Um, this stuff has been working for more than a decade. Yes, we're using accepted parameters.

{quote}
This reason why we insist on an example is because we want to know which parameters are sent as part of query string.
{quote}

Ok, if that's what you need, I will put out an all points bulletin on the ManifoldCF user list for a Solr INFO message that contains an example of long metadata. How many examples do you need to convince yourselves that we're not making this up?
[jira] [Comment Edited] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628630#comment-16628630 ]

Karl Wright edited comment on SOLR-12798 at 9/26/18 11:49 AM:

[~shalinmangar], there's no general answer to that question, because there's no one definitive example of metadata. I refer you to the project page for ManifoldCF here: https://manifoldcf.apache.org/en_US/index.html#What+Is+Apache+ManifoldCF%3F

Just for fun, I dug up a ManifoldCF ticket related to this issue, involving the email connector: https://issues.apache.org/jira/browse/CONNECTORS-1408

was (Author: kwri...@metacarta.com):
[~shalinmangar], there's no general answer to that question, because there's no one definitive example of metadata. I refer you to the project page for ManifoldCF here: https://manifoldcf.apache.org/en_US/index.html#What+Is+Apache+ManifoldCF%3F
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628630#comment-16628630 ]

Karl Wright commented on SOLR-12798:

[~shalinmangar], there's no general answer to that question, because there's no one definitive example of metadata. I refer you to the project page for ManifoldCF here: https://manifoldcf.apache.org/en_US/index.html#What+Is+Apache+ManifoldCF%3F
[jira] [Updated] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated SOLR-12798:

Issue Type: Improvement (was: Bug)
[jira] [Assigned] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright reassigned SOLR-12798:

Assignee: Karl Wright
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628354#comment-16628354 ]

Karl Wright commented on SOLR-12798:

Ok, thanks for the clarification. I will propose SolrJ changes to allow multipart form transport as a first-class citizen, using the ContentWriter construct, and attach those as a patch to this ticket. The other fixes I will propose separately.

Or, if you want to tackle this, I'd be happy to hand it to you. Please let me know.
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628307#comment-16628307 ]

Karl Wright commented on SOLR-12798:

{quote}
this no longer is the case
{quote}

That's good news; in that case I can change things in ManifoldCF accordingly, since we no longer have to enforce a maximum document size limit.

{quote}
I have fixed this problem in the current SolrJ
{quote}

So there's a fix for multipart post usage? Is it committed to master? How do you turn it on, or does it happen automatically?

Once that's there, it would be straightforward to add my other fixes; I'm a Lucene/Solr committer now as well, so I can ticket and propose them, and they will get done this time.
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627826#comment-16627826 ]

Karl Wright commented on SOLR-12798:

[~dsmiley], although it doesn't seem to have been appreciated, SolrJ did have reasonable support for multipart post a few major versions ago, but I appreciate the fact that this is no longer a priority. I'm happy to help get this back to the point that MCF needs.
[jira] [Comment Edited] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627698#comment-16627698 ]

Karl Wright edited comment on SOLR-12798 at 9/25/18 5:39 PM:

[~dsmiley], there are two problems with using UpdateRequest. First, as you point out, the entire document has to hit memory. This is problematic because sometimes these documents are massive, and Tika nevertheless needs all of each document in order to extract content from it. So we allow two modes of operation:

(1) Via Solr Cell, in which case we use ContentStreamUpdateRequest, which embeds a stream and forms the request without having the entire document hit memory.
(2) Via UpdateRequest and SolrInputDocument, but only after Tika has been invoked, and with a length limit.

Even then we have problems with people running out of memory unless they are very careful, given that there are sometimes dozens of indexing requests active at any one time.

None of this, by the way, has anything to do with length limits on the URL, since those are determined solely by metadata, which can be large and is independent of the main content stream. URL limits get in the way just as readily when we use mode (2) as when we use mode (1).

{quote}
Note the existence of UpdateRequest.setDocIterator(Iterator) which can be helpful in streaming and materializing documents on the fly.
{quote}

Yes, of course it can, but the way SolrJ is constructed, it makes no use of this. In fact, it currently doesn't use multipart post at all, unless I override much functionality in order to force it to do so.

was (Author: kwri...@metacarta.com):
[~dsmiley], there are two problems with using UpdateRequest. First, as you point out, the entire document has to hit memory. This is problematic because sometimes these documents are massive, and Tika nevertheless needs all of each document in order to extract content from it. So we allow two modes of operation:

(1) Via Solr Cell, in which case we use ContentStreamUpdateRequest, which embeds a stream and forms the request without having the entire document hit memory.
(2) Via UpdateRequest and SolrInputDocument, but only after Tika has been invoked, and with a length limit.

Even then we have problems with people running out of memory unless they are very careful, given that there are sometimes dozens of indexing requests active at any one time.

None of this, by the way, has anything to do with length limits on the URL, since those are determined solely by metadata, which can be large and is independent of the main content stream. URL limits get in the way just as readily when we use mode (2) as when we use mode (1).
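The quoted suggestion about UpdateRequest.setDocIterator(Iterator) rests on building documents lazily, one at a time, rather than holding them all in memory. A minimal plain-Java sketch of that idea follows. It does not use SolrJ: LazyDocIterator, its loader function, and the loaded counter are hypothetical names introduced only for illustration, and how (or whether) SolrJ consumes such an iterator for multipart posts is exactly what the comment above disputes.

```java
import java.util.Iterator;
import java.util.function.Function;

// Hypothetical helper, not part of SolrJ: each document is built only when the
// consumer asks for it, so at most one materialized document is live at a time.
final class LazyDocIterator<T> implements Iterator<T> {
  private final Iterator<String> ids;        // cheap identifiers of pending documents
  private final Function<String, T> loader;  // expensive step, e.g. fetch + extract
  int loaded = 0;                            // documents built so far (for illustration)

  LazyDocIterator(Iterator<String> ids, Function<String, T> loader) {
    this.ids = ids;
    this.loader = loader;
  }

  @Override
  public boolean hasNext() {
    return ids.hasNext();
  }

  @Override
  public T next() {
    // Delegates exhaustion handling to the underlying id iterator.
    T doc = loader.apply(ids.next());
    loaded++;
    return doc;
  }
}
```

Nothing is loaded when the iterator is constructed; the cost is paid per call to next(), which is what keeps memory bounded when dozens of indexing requests are active at once.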
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627698#comment-16627698 ]

Karl Wright commented on SOLR-12798:

[~dsmiley], there are two problems with using UpdateRequest. First, as you point out, the entire document has to hit memory. This is problematic because sometimes these documents are massive, and Tika nevertheless needs all of each document in order to extract content from it. So we allow two modes of operation:

(1) Via Solr Cell, in which case we use ContentStreamUpdateRequest, which embeds a stream and forms the request without having the entire document hit memory.
(2) Via UpdateRequest and SolrInputDocument, but only after Tika has been invoked, and with a length limit.

Even then we have problems with people running out of memory unless they are very careful, given that there are sometimes dozens of indexing requests active at any one time.

None of this, by the way, has anything to do with length limits on the URL, since those are determined solely by metadata, which can be large and is independent of the main content stream. URL limits get in the way just as readily when we use mode (2) as when we use mode (1).
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627547#comment-16627547 ]

Karl Wright commented on SOLR-12798:

[~noble.paul] 'We are assuming your usecase can only be implemented using a multipart request. Can we see what do you send in the request parameters?'

That's kind of a silly question, if you don't mind me saying so. MCF is a framework with dozens of connectors for accessing different kinds of document repositories. A "document" in ManifoldCF consists of:

- A content stream of unbounded length
- Unlimited metadata, in the form of name/valuelist pairs

Documents that have large amounts of metadata are common. The details vary considerably by source repository. As just one example, we have a client who seemingly specializes in indexing image content. The images are run through Tika, which produces a zero-length text file and sometimes 100K bytes of metadata text, spread across multiple metadata fields.

I hope that's enough to demonstrate why it is impossible to expect all the metadata for a document to fit in the URL.
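To make the contrast with URL parameters concrete, here is a rough plain-Java sketch of how name/valuelist metadata could be carried as multipart/form-data parts in a request body, where size is not constrained by URL length. It is not SolrJ code: MultipartMetadata and its write method are hypothetical, and the sketch assumes field names are plain ASCII without quotes or line breaks (a real encoder would have to escape or reject those).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of multipart/form-data framing; not SolrJ API.
final class MultipartMetadata {
  private static final String CRLF = "\r\n";

  // Writes one text part per metadata value, then the closing boundary.
  // Assumes field names need no escaping (plain ASCII, no quotes or CR/LF).
  static void write(OutputStream out, String boundary,
                    Map<String, List<String>> metadata) throws IOException {
    for (Map.Entry<String, List<String>> field : metadata.entrySet()) {
      for (String value : field.getValue()) {
        String part = "--" + boundary + CRLF
            + "Content-Disposition: form-data; name=\"" + field.getKey() + "\"" + CRLF
            + "Content-Type: text/plain; charset=UTF-8" + CRLF
            + CRLF
            + value + CRLF;
        out.write(part.getBytes(StandardCharsets.UTF_8));
      }
    }
    // The final delimiter carries two trailing hyphens.
    out.write(("--" + boundary + "--" + CRLF).getBytes(StandardCharsets.UTF_8));
  }
}
```

A 100K-byte metadata value is handled the same way as a ten-byte one here; the boundary string must simply be chosen so that it never occurs inside a value.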
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627138#comment-16627138 ]

Karl Wright commented on SOLR-12798:

It looks like the only implementer of ContentWriter is StringPayloadContentWriter, which just furnishes a string for output, correct? In order to work within that framework, ContentStreamUpdateRequest would need a streaming ContentWriter implementation that pulls from the input and writes to the output. That seems to be missing.

Beyond that, none of this has anything to do with how the content is actually transmitted. The assumption seems to be that the new ContentWriter stuff all goes via PUT with metadata in the URL. That's not good for two reasons: first, the URL length problems I've already mentioned; and second, Solr Cell uses the "name" part of the multipart post to inject its own bit of metadata into the document, and there would be no way to transmit that anymore.

Logic is therefore still going to be needed to use multipart forms under specific circumstances. Maybe there needs to be a useMultipart() method in all Requests, and HttpSolrClient should look at that to decide whether to use multipart or standard PUT?
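As a rough illustration of the missing piece described above, the sketch below shows a writer that pulls from an input stream and pushes to the output in fixed-size chunks, so the whole document never sits in memory. ContentWriterSketch is a local stand-in modeled on the two-method shape of SolrJ's RequestWriter.ContentWriter so the example compiles without SolrJ; StreamingContentWriter and the 8 KB buffer size are assumptions made for the example, not existing SolrJ classes or settings.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Local stand-in so the sketch is self-contained; modeled on, but not the same
// type as, SolrJ's RequestWriter.ContentWriter.
interface ContentWriterSketch {
  void write(OutputStream os) throws IOException;

  String getContentType();
}

// Hypothetical streaming implementation: copies the source in small chunks.
final class StreamingContentWriter implements ContentWriterSketch {
  private final InputStream source;
  private final String contentType;

  StreamingContentWriter(InputStream source, String contentType) {
    this.source = source;
    this.contentType = contentType;
  }

  @Override
  public void write(OutputStream os) throws IOException {
    byte[] buffer = new byte[8192]; // memory use is bounded by this buffer
    int n;
    while ((n = source.read(buffer)) != -1) {
      os.write(buffer, 0, n);
    }
    os.flush();
  }

  @Override
  public String getContentType() {
    return contentType;
  }
}
```

This only addresses the streaming half of the comment. It says nothing about transport: whether the bytes go out as a raw body or as one part of a multipart form is a separate decision, which is what the proposed useMultipart() hook would control.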
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627098#comment-16627098 ] Karl Wright commented on SOLR-12798: Hi [~noble.paul], as I explained before, our document metadata quite often exceeds the maximum URL length. In fact, it's the typical case. That is why we must use multipart post in this application. My rough estimate is that more than 90% of ManifoldCF users fall into this category. > Structural changes in SolrJ since version 7.0.0 have effectively disabled > multipart post > > > Key: SOLR-12798 > URL: https://issues.apache.org/jira/browse/SOLR-12798 > Project: Solr > Issue Type: Bug > Security Level: Public (Default Security Level. Issues are Public) > Components: SolrJ > Affects Versions: 7.4 > Reporter: Karl Wright > Priority: Major > > Project ManifoldCF uses SolrJ to post documents to Solr. When upgrading from > SolrJ 7.0.x to SolrJ 7.4, we encountered significant structural changes to > SolrJ's HttpSolrClient class that seemingly disable any use of multipart > post. This is critical because ManifoldCF's documents often contain metadata > in excess of 4K that therefore cannot be stuffed into a URL. > The changes in question seem to have been performed by Noble Paul on > 10/31/2017, with the introduction of the RequestWriter mechanism. Basically, > if a request has a RequestWriter, it is used exclusively to write the > request, and that overrides the stream mechanism completely. I haven't > chased it back to a specific ticket. > ManifoldCF's usage of SolrJ involves the creation of > ContentStreamUpdateRequests for all posts meant for Solr Cell, and the > creation of UpdateRequests for posts not meant for Solr Cell (as well as for > delete and commit requests). For our release cycle that is taking place > right now, we're shipping a modified version of HttpSolrClient that ignores > the RequestWriter when dealing with ContentStreamUpdateRequests.
We > apparently cannot use multipart for all requests because, when we do, we > get "pfountz Should not get here!" errors on the Solr side, which > generate HTTP error code 500 responses. That should not happen either, in my > opinion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
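To make the URL-length argument concrete, here is a rough sketch of the check involved. Everything in it is an assumption for illustration: the class `UrlBudget`, the 4096-character limit (taken loosely from the "in excess of 4K" remark; real servers and proxies differ), and the `/solr/update` path are hypothetical and are not part of SolrJ or ManifoldCF. It needs Java 10 or later for `URLEncoder.encode(String, Charset)`.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not SolrJ or ManifoldCF code.
final class UrlBudget {
  // Assumed limit; actual limits depend on the server, proxies and clients.
  static final int ASSUMED_MAX_URL_LENGTH = 4096;

  // Builds "?k1=v1&k2=v2" with URL-encoded keys and values.
  static String toQueryString(Map<String, String> params) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : params.entrySet()) {
      sb.append(sb.length() == 0 ? '?' : '&');
      sb.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8));
      sb.append('=');
      sb.append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
    }
    return sb.toString();
  }

  // True when base URL plus encoded metadata stays within the assumed limit.
  static boolean fitsInUrl(String baseUrl, Map<String, String> params) {
    return baseUrl.length() + toQueryString(params).length() <= ASSUMED_MAX_URL_LENGTH;
  }

  public static void main(String[] args) {
    Map<String, String> metadata = new LinkedHashMap<>();
    metadata.put("title", "a b");
    System.out.println(fitsInUrl("http://localhost:8983/solr/update", metadata));
  }
}
```

When `fitsInUrl` returns false, the metadata has to travel in the request body, which is the case the issue is about.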
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627028#comment-16627028 ] Karl Wright commented on SOLR-12798:
[~noble.paul] We have a custom implementation because SolrJ and indeed HttpComponents/HttpClient have problems we're forced to work around. These have been raised before but apparently not taken very seriously so far. The need to work around things has become even more significant with the latest release. ModifiedHttpSolrClient is a derivation of HttpSolrClient. The overridden method, createMethod(), is a direct copy of HttpSolrClient.createMethod() with certain very specific changes. These are apparently all still necessary. I've included the method code below. If I disable this custom method and use the standard code, I *never* get multipart form posts at all. That is unacceptable in this application. With the current modifications included below, I get multipart posts for everything, including deletions, which breaks because Solr doesn't like that. I'm asking for advice on how to get multipart posts only for documents, whether they are transmitted by ContentStreamUpdateRequest or by UpdateRequest.add(SolrInputDocument).
{code}
@Override
protected HttpRequestBase createMethod(SolrRequest request, String collection)
    throws IOException, SolrServerException {
  if (request instanceof V2RequestSupport) {
    request = ((V2RequestSupport) request).getV2Request();
  }
  SolrParams params = request.getParams();
  RequestWriter.ContentWriter contentWriter = requestWriter.getContentWriter(request);
  Collection<ContentStream> streams =
      contentWriter == null ? requestWriter.getContentStreams(request) : null;
  String path = requestWriter.getPath(request);
  if (path == null || !path.startsWith("/")) {
    path = DEFAULT_PATH;
  }

  ResponseParser parser = request.getResponseParser();
  if (parser == null) {
    parser = this.parser;
  }

  // The parser 'wt=' and 'version=' params are used instead of the original
  // params
  ModifiableSolrParams wparams = new ModifiableSolrParams(params);
  if (parser != null) {
    wparams.set(CommonParams.WT, parser.getWriterType());
    wparams.set(CommonParams.VERSION, parser.getVersion());
  }
  if (invariantParams != null) {
    wparams.add(invariantParams);
  }

  String basePath = baseUrl;
  if (collection != null)
    basePath += "/" + collection;

  if (request instanceof V2Request) {
    if (System.getProperty("solr.v2RealPath") == null) {
      basePath = baseUrl.replace("/solr", "/api");
    } else {
      basePath = baseUrl + "/v2";
    }
  }

  if (SolrRequest.METHOD.GET == request.getMethod()) {
    if (streams != null || contentWriter != null) {
      throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "GET can't send streams!");
    }
    return new HttpGet(basePath + path + toQueryString(wparams, false));
  }

  if (SolrRequest.METHOD.DELETE == request.getMethod()) {
    return new HttpDelete(basePath + path + toQueryString(wparams, false));
  }

  if (SolrRequest.METHOD.POST == request.getMethod() || SolrRequest.METHOD.PUT == request.getMethod()) {
    // UpdateRequest uses PUT now, and ContentStreamUpdateRequest uses POST.
    // We must override PUT with POST if multipart is on.
    // If useMultipart is on, we fall back to getting streams directly from the request.
    final boolean mustUseMultipart;
    final SolrRequest.METHOD methodToUse;
    if (this.useMultiPartPost) {
      final Collection requestStreams = request.getContentStreams();
      mustUseMultipart = requestStreams != null && requestStreams.size() > 0;
      if (mustUseMultipart) {
        System.out.println("Overriding with multipart post");
        streams = requestStreams;
        methodToUse = SolrRequest.METHOD.POST;
      } else {
        methodToUse = request.getMethod();
      }
    } else {
      mustUseMultipart = false;
      methodToUse = request.getMethod();
    }
    //System.out.println("Post or put");

    String url = basePath + path;
    /*
    boolean hasNullStreamName = false;
    if (streams != null) {
      for (ContentStream cs : streams) {
        if (cs.getName() == null) {
          hasNullStreamName = true;
          break;
        }
      }
    }
    */

    /*
    final boolean isMultipart = ((this.useMultiPartPost && SolrRequest.METHOD.POST == methodToUse)
        || (streams != null && streams.size() > 1)) && !hasNullStreamName;
    */
    final boolean isMultipart = this.useMultiPartPost && SolrRequest.METHOD.POST == methodToUse && (streams != null &
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627010#comment-16627010 ] Karl Wright commented on SOLR-12798:
I'm looking for workarounds -- initially, at least. What I've tried is adding the following code in the POST/PUT section of the HttpSolrClient code:
{code}
// UpdateRequest uses PUT now, and ContentStreamUpdateRequest uses POST.
// We must override PUT with POST if multipart is on.
// If useMultipart is on, we fall back to getting streams directly from the request.
final boolean mustUseMultipart;
final SolrRequest.METHOD methodToUse;
if (this.useMultiPartPost) {
  final Collection requestStreams = request.getContentStreams();
  mustUseMultipart = requestStreams != null && requestStreams.size() > 0;
  if (mustUseMultipart) {
    System.out.println("Overriding with multipart post");
    streams = requestStreams;
    methodToUse = SolrRequest.METHOD.POST;
  } else {
    methodToUse = request.getMethod();
  }
} else {
  mustUseMultipart = false;
  methodToUse = request.getMethod();
}
//System.out.println("Post or put");

String url = basePath + path;
/*
boolean hasNullStreamName = false;
if (streams != null) {
  for (ContentStream cs : streams) {
    if (cs.getName() == null) {
      hasNullStreamName = true;
      break;
    }
  }
}
*/

/*
final boolean isMultipart = ((this.useMultiPartPost && SolrRequest.METHOD.POST == methodToUse)
    || (streams != null && streams.size() > 1)) && !hasNullStreamName;
*/
final boolean isMultipart = this.useMultiPartPost && SolrRequest.METHOD.POST == methodToUse
    && (streams != null && streams.size() >= 1);
System.out.println("isMultipart = "+isMultipart);
{code}
The problem is that when multipart post is used for document delete requests, they fail because the stream is empty. And the code above doesn't distinguish between UpdateRequests that include real documents and UpdateRequests that are delete requests. Any ideas?
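One possible way to draw the distinction asked about above is to look at what the request actually carries instead of at its class. The sketch below only models that idea with stand-in types: `SketchUpdateRequest` and `MultipartDecision` are hypothetical and are not SolrJ classes. SolrJ's real UpdateRequest does expose getDocuments(), but whether that is a suitable hook inside createMethod() in 7.4 is an assumption that would need checking.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Stand-in for an update request; NOT the SolrJ UpdateRequest class.
final class SketchUpdateRequest {
  final List<String> documents; // documents to add (may be empty)
  final List<String> deleteIds; // ids to delete (may be empty)

  SketchUpdateRequest(List<String> documents, List<String> deleteIds) {
    this.documents = documents;
    this.deleteIds = deleteIds;
  }
}

// Hypothetical decision helper for illustration only.
final class MultipartDecision {
  // Multipart is used only when it is enabled AND the request carries
  // at least one document; delete-only and empty requests fall through.
  static boolean shouldUseMultipart(boolean useMultiPartPost, SketchUpdateRequest request) {
    return useMultiPartPost && !request.documents.isEmpty();
  }

  public static void main(String[] args) {
    SketchUpdateRequest add =
        new SketchUpdateRequest(Arrays.asList("doc1"), Collections.<String>emptyList());
    SketchUpdateRequest delete =
        new SketchUpdateRequest(Collections.<String>emptyList(), Arrays.asList("id1"));
    System.out.println(shouldUseMultipart(true, add));
    System.out.println(shouldUseMultipart(true, delete));
  }
}
```

With a check of this kind, delete and commit requests would keep their original method and body, and only document-bearing updates would be switched to multipart POST.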
[jira] [Issue Comment Deleted] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated SOLR-12798: --- Comment: was deleted (was: Thinking about this: I think the fix might well be to add a decent implementation of getContentStreams() to BinaryRequestWriter, and then prioritizing the use of content streams in HttpSolrClient when useMultipart is true. That would fix the basic problem, if it doesn't introduce other ones. For the ManifoldCF project's immediate release concerns, I'd have to create a ModifiedUpdateRequest class and a ModifiedBinaryRequestWriter class, if they're not locked down anyway, and use ModifiedUpdateRequest instead of UpdateRequest whenever I need to add SolrInputDocuments. I'll check out whether this would work. That makes some six SolrJ classes that ManifoldCF needs to override, however, just to get multipart post to work properly. I think it's time to make multipart post a first-class citizen for SolrJ, no? )
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626903#comment-16626903 ] Karl Wright commented on SOLR-12798: Thinking about this: I think the fix might well be to add a decent implementation of getContentStreams() to BinaryRequestWriter, and then to prioritize the use of content streams in HttpSolrClient when useMultipart is true. That would fix the basic problem, provided it doesn't introduce other ones. For the ManifoldCF project's immediate release concerns, I'd have to create a ModifiedUpdateRequest class and a ModifiedBinaryRequestWriter class, if they're not locked down anyway, and use ModifiedUpdateRequest instead of UpdateRequest whenever I need to add SolrInputDocuments. I'll check whether this would work. That makes some six SolrJ classes that ManifoldCF needs to override, however, just to get multipart post to work properly. I think it's time to make multipart post a first-class citizen for SolrJ, no?
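The ordering proposed in the comment above can be modelled in a few lines. This is a sketch under stated assumptions, not SolrJ code: `BodySource`, `BodySourceChooser` and both method names are hypothetical, and the "writer first" rule is only a paraphrase of the behaviour described in the issue (a content writer, when present, is used exclusively).

```java
// Hypothetical model of how a request body source could be chosen.
// None of these names exist in SolrJ.
enum BodySource { CONTENT_WRITER, CONTENT_STREAMS, NONE }

final class BodySourceChooser {
  // Behaviour as described in the issue: a content writer, when present,
  // is used exclusively and the streams are ignored.
  static BodySource writerFirst(boolean hasContentWriter, int streamCount) {
    if (hasContentWriter) {
      return BodySource.CONTENT_WRITER;
    }
    return streamCount > 0 ? BodySource.CONTENT_STREAMS : BodySource.NONE;
  }

  // Proposed ordering: when multipart is enabled and the request has
  // at least one content stream, the streams take priority.
  static BodySource streamsFirstWhenMultipart(
      boolean useMultipart, boolean hasContentWriter, int streamCount) {
    if (useMultipart && streamCount > 0) {
      return BodySource.CONTENT_STREAMS;
    }
    return writerFirst(hasContentWriter, streamCount);
  }

  public static void main(String[] args) {
    System.out.println(writerFirst(true, 2));
    System.out.println(streamsFirstWhenMultipart(true, true, 2));
  }
}
```

Note that the second rule only helps if the request writer can actually produce content streams, which is why the comment also calls for a working getContentStreams() implementation.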
[jira] [Comment Edited] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626079#comment-16626079 ] Karl Wright edited comment on SOLR-12798 at 9/24/18 6:30 PM: - [~noble.paul], I have verified that the problem still exists on Solr 7.5. was (Author: kwri...@metacarta.com): [~noble.paul], I can verify that the problem still exists on Solr 7.5.
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626079#comment-16626079 ] Karl Wright commented on SOLR-12798: [~noble.paul], I can verify that the problem still exists on Solr 7.5.
[jira] [Comment Edited] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625809#comment-16625809 ] Karl Wright edited comment on SOLR-12798 at 9/24/18 1:58 PM: - The status is as follows: (1) I've confirmed that the RequestWriter override only permits multipart form requests for the "commit" request. "Update" or "Delete" both do not allow this pathway at all. (2) If I change the logic for all POST and PUT requests to disable the contentWriter clause, POST requests of documents work properly, but delete document requests fail, with the following exception: {code} java.lang.RuntimeException: This Should not happen at org.apache.solr.client.solrj.impl.BinaryRequestWriter.getContentStreams(BinaryRequestWriter.java:67) ~[?:?] at org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrClient.createMethod(ModifiedHttpSolrClient.java:175) ~[?:?] {code} (4) Conditionally disabling contentWriter when the request is of class ContentStreamUpdateRequest allows things to work partly. Text documents that are indexed via standard UpdateRequest do not use multipart post, however. So we need a better solution. was (Author: kwri...@metacarta.com): The status is as follows: (1) I've confirmed that the RequestWriter override only permits multipart form requests for the "commit" request. "Update" or "Delete" both do not allow this pathway at all. (2) If I change the logic for all POST and PUT requests to disable the contentWriter clause, POST requests of documents work properly, but delete document requests fail. (4) Conditionally disabling contentWriter when the request is of class ContentStreamUpdateRequest allows things to work partly. Text documents that are indexed via standard UpdateRequest do not use multipart post, however. So we need a better solution. 
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625809#comment-16625809 ] Karl Wright commented on SOLR-12798: The status is as follows: (1) I've confirmed that the RequestWriter override only permits multipart form requests for the "commit" request. "Update" or "Delete" both do not allow this pathway at all. (2) If I change the logic for all POST and PUT requests to disable the contentWriter clause, POST requests of documents work properly, but delete document requests fail. (4) Conditionally disabling contentWriter when the request is of class ContentStreamUpdateRequest allows things to work partly. Text documents that are indexed via standard UpdateRequest do not use multipart post, however. So we need a better solution.
[jira] [Issue Comment Deleted] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated SOLR-12798: --- Comment: was deleted (was: We're researching the actual issue that is blocking release. It seems that deleting documents using a Solr Cloud installation may not be working; for each document, we're seeing a 400 error with the following message: {code} Error from server at http://localhost:8983/solr/FileShare_shard1_replica_n1: missing content stream: Error from server at http://localhost:8983/solr/FileShare_shard1_replica_n1: missing content stream {code} Furthermore, after checking the Solr index, none of the documents have been removed. This is obviously severe and we're trying now to confirm that this happens without our modifications to HttpSolrClient. )
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625558#comment-16625558 ] Karl Wright commented on SOLR-12798: We're researching the actual issue that is blocking release. It seems that deleting documents using a Solr Cloud installation may not be working; for each document, we're seeing a 400 error with the following message: Error from server at http://localhost:8983/solr/FileShare_shard1_replica_n1: missing content stream: Error from server at http://localhost:8983/solr/FileShare_shard1_replica_n1: missing content stream Furthermore, after checking the Solr index, none of the documents have been removed. This is obviously severe and we're trying now to confirm that this happens without our modifications to HttpSolrClient.
[jira] [Comment Edited] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625558#comment-16625558 ]

Karl Wright edited comment on SOLR-12798 at 9/24/18 9:08 AM:

We're researching the actual issue that is blocking release. It seems that deleting documents using a Solr Cloud installation may not be working; for each document, we're seeing a 400 error with the following message:

{code}
Error from server at http://localhost:8983/solr/FileShare_shard1_replica_n1: missing content stream: Error from server at http://localhost:8983/solr/FileShare_shard1_replica_n1: missing content stream
{code}

Furthermore, after checking the Solr index, none of the documents have been removed. This is obviously severe and we're trying now to confirm that this happens without our modifications to HttpSolrClient.
[jira] [Commented] (SOLR-12798) Structural changes in SolrJ since version 7.0.0 have effectively disabled multipart post
[ https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625544#comment-16625544 ]

Karl Wright commented on SOLR-12798:

[~noble.paul], any help would be welcome. We're in a ManifoldCF release cycle now, and SolrJ issues are blocking it.