Re: Survey on ManagedResources feature
So, it's not very different from directly reading a file from ZK? What benefit do you get by using the ManagedResourceStorage?

On Sun, Aug 16, 2020 at 7:08 PM Matthias Krueger wrote:
>
> In a custom SolrRequestHandler#handleRequest something like this:
>
> final ManagedResourceStorage.StorageIO storageIO =
>     ManagedResourceStorage.newStorageIO(core.getCoreDescriptor().getCollectionName(),
>         resourceLoader, new NamedList<>());
>
> And then using
>
> storageIO.openOutputStream(resourceName)
>
> to store some (well-known) resources.
>
> Matt
>
> On 15.08.20 11:38, Noble Paul wrote:
> >> I use ManagedResource#StorageIO and its implementations as a convenient way
> >> to abstract away the underlying config storage when creating plugins that
> >> need to support both SolrCloud and Solr Standalone.
> > Can you give us some more details on how you use it?
> >
> > On Sat, Aug 15, 2020 at 7:32 PM Noble Paul wrote:
> >>> As authentication is plugged into the SolrDispatchFilter I would assume
> >>> that you would need to be authenticated to read/write Managed Resources
> >> I'm talking about the authorization plugins
> >>
> >> On Fri, Aug 14, 2020 at 10:20 PM Matthias Krueger wrote:
> >>>
> >>> As authentication is plugged into the SolrDispatchFilter I would assume
> >>> that you would need to be authenticated to read/write Managed Resources
> >>> but no authorization is checked (i.e. any authenticated user can
> >>> read/write them), correct?
> >>>
> >>> Anyway, I came across Managed Resources in at least two scenarios:
> >>>
> >>> The LTR plugin is using them for updating model/features.
> >>> I use ManagedResource#StorageIO and its implementations as a convenient
> >>> way to abstract away the underlying config storage when creating plugins
> >>> that need to support both SolrCloud and Solr Standalone.
> >>>
> >>> IMO an abstraction that allows distributing configuration (ML models,
> >>> configuration snippets, external file fields...) that exceeds the typical
> >>> ZK size limits to SolrCloud while also supporting Solr Standalone would
> >>> be nice to have.
> >>>
> >>> Matt
> >>>
> >>> On 12.08.20 02:08, Noble Paul wrote:
> >>>
> >>> The end point is served by Restlet. So, your rules are not going to be
> >>> honored. The rules work only if it is served by a Solr request handler.
> >>>
> >>> On Wed, Aug 12, 2020, 12:46 AM Jason Gerlowski wrote:
> Hey Noble,
>
> Can you explain what you mean when you say it's not secured? Just for
> those of us who haven't been following the discussion so far? On the
> surface of things, users taking advantage of our RuleBasedAuth plugin
> can secure this API like they can any other HTTP API. Or are you
> talking about some other security aspect here?
>
> Jason
>
> On Tue, Aug 11, 2020 at 9:55 AM Noble Paul wrote:
> > Hi all,
> > The end-point for Managed Resources is not secured, so it needs to be
> > fixed/eliminated.
> >
> > I would like to know what is the level of adoption for that feature
> > and if it is a critical feature for users.
> >
> > Another possibility is to offer a replacement for the feature using a
> > different API.
> >
> > Your feedback will help us decide on what a potential solution should be.
> >
> > --
> > -
> > Noble Paul

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

--
-
Noble Paul
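The dual-backend use case Matthias describes can be sketched without any Solr dependency. Below is a minimal, self-contained illustration of the pattern, with hypothetical names (`StorageIO`, `InMemoryStorageIO`, `StorageDemo`) modeled loosely on the API quoted above, not Solr's actual classes: plugins write resources through one small interface, and the backing store (local file system in standalone mode, ZooKeeper in SolrCloud) is chosen at startup. Here an in-memory map stands in for either backend.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the storage-abstraction pattern discussed above.
// These are illustrative names, not Solr's actual API.
interface StorageIO {
    OutputStream openOutputStream(String resourceName) throws IOException;
    InputStream openInputStream(String resourceName) throws IOException;
}

// In-memory stand-in for a file-system- or ZooKeeper-backed implementation.
class InMemoryStorageIO implements StorageIO {
    private final Map<String, byte[]> store = new HashMap<>();

    @Override
    public OutputStream openOutputStream(String resourceName) {
        return new ByteArrayOutputStream() {
            @Override
            public void close() throws IOException {
                super.close();
                store.put(resourceName, toByteArray()); // commit on close
            }
        };
    }

    @Override
    public InputStream openInputStream(String resourceName) throws IOException {
        byte[] data = store.get(resourceName);
        if (data == null) throw new FileNotFoundException(resourceName);
        return new ByteArrayInputStream(data);
    }
}

public class StorageDemo {
    public static void main(String[] args) throws IOException {
        // Plugin code only sees the interface; the backend is swappable.
        StorageIO storageIO = new InMemoryStorageIO();
        try (OutputStream out = storageIO.openOutputStream("synonyms.json")) {
            out.write("{\"synonyms\":{}}".getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = storageIO.openInputStream("synonyms.json")) {
            // prints {"synonyms":{}}
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

The point of the indirection is exactly what the thread says: plugin code is written once against `StorageIO`, and only the factory call decides whether resources land on disk or in ZK.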
Re: When zero offsets are not bad - a.k.a. multi-token synonyms yet again
Hi Mike,

I'm sorry, the problem was all along related to a word-delimiter filter factory. This is embarrassing, but I have to admit it publicly and self-flagellate. A word-delimiter filter is used to split tokens, and these are then used to find multi-token synonyms (hence the connection). In my desire to simplify, I omitted that detail while writing my first email.

I went to generate the stack trace:

```
assertU(adoc("id", "603", "bibcode", "xx603", "title",
    "THE HUBBLE constant: a summary of the HUBBLE SPACE TELESCOPE program"));
```

```
stage:indexer term=xx603 pos=1 type=word offsetStart=0 offsetEnd=13
stage:indexer term=acr::the pos=1 type=ACRONYM offsetStart=0 offsetEnd=3
stage:indexer term=hubble pos=1 type=word offsetStart=4 offsetEnd=10
stage:indexer term=acr::hubble pos=0 type=ACRONYM offsetStart=4 offsetEnd=10
stage:indexer term=constant pos=1 type=word offsetStart=11 offsetEnd=20
stage:indexer term=summary pos=1 type=word offsetStart=23 offsetEnd=30
stage:indexer term=hubble pos=1 type=word offsetStart=38 offsetEnd=44
stage:indexer term=syn::hubble space telescope pos=0 type=SYNONYM offsetStart=38 offsetEnd=60
stage:indexer term=syn::hst pos=0 type=SYNONYM offsetStart=38 offsetEnd=60
stage:indexer term=space pos=1 type=word offsetStart=45 offsetEnd=50
stage:indexer term=telescope pos=1 type=word offsetStart=51 offsetEnd=60
stage:indexer term=program pos=1 type=word offsetStart=61 offsetEnd=68
```

That worked; only the next one failed:

```
assertU(adoc("id", "605", "bibcode", "xx604", "title", "MIT and anti de sitter space-time"));
```

```
stage:indexer term=xx604 pos=1 type=word offsetStart=0 offsetEnd=13
stage:indexer term=mit pos=1 type=word offsetStart=0 offsetEnd=3
stage:indexer term=acr::mit pos=0 type=ACRONYM offsetStart=0 offsetEnd=3
stage:indexer term=syn::massachusetts institute of technology pos=0 type=SYNONYM offsetStart=0 offsetEnd=3
stage:indexer term=syn::mit pos=0 type=SYNONYM offsetStart=0 offsetEnd=3
stage:indexer term=anti pos=1 type=word offsetStart=8 offsetEnd=12
stage:indexer term=syn::ads pos=0 type=SYNONYM offsetStart=8 offsetEnd=28
stage:indexer term=syn::anti de sitter space pos=0 type=SYNONYM offsetStart=8 offsetEnd=28
stage:indexer term=syn::antidesitter spacetime pos=0 type=SYNONYM offsetStart=8 offsetEnd=28
stage:indexer term=de pos=1 type=word offsetStart=13 offsetEnd=15
stage:indexer term=sitter pos=1 type=word offsetStart=16 offsetEnd=22
stage:indexer term=space pos=1 type=word offsetStart=23 offsetEnd=28
stage:indexer term=time pos=1 type=word offsetStart=29 offsetEnd=33
stage:indexer term=spacetime pos=0 type=word offsetStart=23 offsetEnd=33
```

```
325677 ERROR (TEST-TestAdsabsTypeFulltextParsing.testNoSynChain-seed#[ADFAB495DA8F6F40]) []
    o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Exception writing document
    id 605 to the index; possible analysis error: startOffset must be non-negative, and
    endOffset must be >= startOffset, and offsets must not go backwards
    startOffset=23,endOffset=33,lastStartOffset=29 for field 'title'
  at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:242)
  at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
  at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
  at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:1002)
  at org.apache.solr.update.processor.DistributedUpdateProcessor.doVersionAdd(DistributedUpdateProcessor.java:1233)
  at org.apache.solr.update.processor.DistributedUpdateProcessor.lambda$2(DistributedUpdateProcessor.java:1082)
  at org.apache.solr.update.VersionBucket.runWithLock(VersionBucket.java:50)
  at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1082)
  at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:694)
  at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
  at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:261)
  at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:188)
  at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
  at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:2551)
  at org.apache.solr.servlet.DirectSolrConnection.request(DirectSolrConnection.java:125)
  at org.apache.solr.util.TestHarness.update(TestHarness.java:285)
  at org.apache.solr.util.BaseTestHarness.checkUpdateStatus(BaseTestHarness.java:274)
  at org.apache.solr.util.BaseTestHarness.validateUpdate(BaseTestHarness.java:244)
  at org.apache.solr.SolrTestCaseJ4.checkUp
```
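For reference, the invariant the exception reports is easy to reproduce in isolation. This is a minimal sketch (not Lucene's actual indexing code) of the three checks named in the message: startOffset must be non-negative, endOffset must be >= startOffset, and startOffset must never go backwards within a field. Feeding it the last three tokens of the failing dump (`space` 23-28, `time` 29-33, `spacetime` 23-33) reproduces the same violation:

```java
import java.util.List;

// Illustrative re-implementation of the offset checks the exception describes;
// not Lucene's actual code, just the stated invariant.
public class OffsetCheck {
    // Each token is {startOffset, endOffset}; returns null if the stream is
    // valid, otherwise a message in the same shape as the Solr exception.
    static String validate(List<int[]> tokens) {
        int lastStartOffset = 0;
        for (int[] t : tokens) {
            int start = t[0], end = t[1];
            if (start < 0 || end < start || start < lastStartOffset) {
                return "startOffset=" + start + ",endOffset=" + end
                        + ",lastStartOffset=" + lastStartOffset;
            }
            lastStartOffset = start;
        }
        return null;
    }

    public static void main(String[] args) {
        // space(23,28), time(29,33), then spacetime(23,33) goes backwards.
        // prints startOffset=23,endOffset=33,lastStartOffset=29
        System.out.println(validate(List.of(
                new int[]{23, 28}, new int[]{29, 33}, new int[]{23, 33})));
    }
}
```

This matches the failure above: the word-delimiter chain emits the catenated `spacetime` token after `time`, so its startOffset (23) is behind the last startOffset seen (29).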
Atomic updates, copyField and stored=true
It _finally_ occurred to me to ask why we have the restriction that the destination of a copyField must have stored=false. I understand what currently happens when the destination is stored: you get repeats. What I wondered is why we can't detect that a field is the destination of a copyField and _not_ pull the stored values out of it during atomic updates. Or do we run afoul of things in tlog retrieval or RTG? Is this a silly idea, or should I raise a JIRA?
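The "repeats" behavior can be illustrated with a toy simulation. This is not Solr code, and the field names (`title`, `catchall`) are hypothetical; it only models the mechanism described above: an atomic update reads the stored document back, including the values previously copied into the destination, and re-indexing fires the copyField again, appending another copy of the source value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of why a stored copyField destination grows on every atomic
// update. Hypothetical field names; not Solr's implementation.
public class CopyFieldRepeats {
    static final String SRC = "title", DEST = "catchall";

    // Simulates re-indexing a document rebuilt from its stored fields:
    // the copyField SRC -> DEST rule runs again on every (re)index.
    static Map<String, List<String>> reindex(Map<String, List<String>> storedDoc) {
        Map<String, List<String>> doc = new HashMap<>();
        storedDoc.forEach((k, v) -> doc.put(k, new ArrayList<>(v)));
        doc.computeIfAbsent(DEST, k -> new ArrayList<>()).addAll(doc.get(SRC));
        return doc;
    }

    public static void main(String[] args) {
        Map<String, List<String>> doc = new HashMap<>();
        doc.put(SRC, List.of("hello"));
        doc = reindex(doc); // initial index: catchall=[hello]
        doc = reindex(doc); // atomic update: stored catchall comes back, copied again
        System.out.println(doc.get(DEST)); // prints [hello, hello]
    }
}
```

The proposal in the question amounts to dropping the destination's stored values before the `reindex` step when the field is known to be a copyField target, so only the freshly copied values survive.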
Re: Migrating to Cloudbees
Thank you Cassandra,

I quickly looked into the config. Seems easy altogether. I can possibly also set this up for Javadocs. The good thing with Javadocs is that we can better configure linking between Solr and Lucene, so this is a good thing. I will try to set something up if I have some time.

Uwe

On August 18, 2020 7:54:39 PM UTC, Cassandra Targett wrote:
> Follow-up on this - we fixed the Solr Ref Guide builds last week, but
> there was an outstanding issue: the Content Security Policy on
> Cloudbees is too stringent to display the Ref Guide’s CSS and JS. It
> basically blocked all the content, rendering the pages unreadable.
>
> Infra helped us straighten it out by setting up the ability for us to
> push the artifacts of Ref Guide builds to a new server they’ve recently
> set up to host nightly builds. I’ve updated all the Ref Guide jobs to
> do that and fixed their descriptions to point to the new locations. You
> can find them at https://nightlies.apache.org/Lucene/.
>
> The Javadocs for both Lucene and Solr also suffer from the same limited
> CSP, but the Javadocs seem to be able to mostly recover from it. It is
> possible to push them to the nightlies server for the full JS-enabled
> experience if we choose.
>
> Infra is also quite open (enthusiastic?) for people to use this server,
> so if there is any interest in pushing other build artifacts out to it
> as a regular place to get pre-release builds, we’re welcome to do so. I
> can help, or you can look at one of the Ref Guide jobs for an example.
>
> Cassandra
> On Aug 7, 2020, 12:17 PM -0500, Ishan Chattopadhyaya, wrote:
>> Thanks for your work, Uwe. I would love to run a public Jenkins
>> server soon (maybe by September), would like to try out your scripts
>> :-)
>>
>>> On Fri, Aug 7, 2020 at 10:12 PM David Smiley wrote:
>>>> Sweet! Thanks Uwe!
>>>> ~ David Smiley
>>>> Apache Lucene/Solr Search Developer
>>>> http://www.linkedin.com/in/davidwsmiley
>>>>
>>>>> On Thu, Aug 6, 2020 at 5:52 PM Uwe Schindler wrote:
>>>>>> Thanks Erick!
>>>>>>
>>>>>> I hope the remaining issues sort out quite soon.
>>>>>>
>>>>>> For the release managers: As I did a more scripted, automatic
>>>>>> migration using the Jenkins REST API (otherwise the 50 jobs we have
>>>>>> would have been a disaster to migrate), I already have a plan to reuse
>>>>>> that script to allow the release manager to create clones of all
>>>>>> "master" jobs, preconfigured for the release branch. All you need is a
>>>>>> Lucene PMC status and a Jenkins API Token, and then you will be able to
>>>>>> start a script that creates all release branch jobs in a few seconds 😊
>>>>>>
>>>>>> Uwe
>>>>>>
>>>>>> -
>>>>>> Uwe Schindler
>>>>>> Achterdiek 19, D-28357 Bremen
>>>>>> https://www.thetaphi.de
>>>>>> eMail: u...@thetaphi.de
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Erick Erickson
>>>>>>> Sent: Thursday, August 6, 2020 11:39 PM
>>>>>>> To: dev@lucene.apache.org
>>>>>>> Subject: Migrating to Cloudbees
>>>>>>>
>>>>>>> If nobody has expressed their _extreme_ gratitude to Uwe, infra
>>>>>>> (and helpers?) for the migration, I hereby rectify that!!

--
Uwe Schindler
Achterdiek 19, 28357 Bremen
https://www.thetaphi.de
Re: Migrating to Cloudbees
Follow-up on this - we fixed the Solr Ref Guide builds last week, but there was an outstanding issue: the Content Security Policy on Cloudbees is too stringent to display the Ref Guide’s CSS and JS. It basically blocked all the content, rendering the pages unreadable.

Infra helped us straighten it out by setting up the ability for us to push the artifacts of Ref Guide builds to a new server they’ve recently set up to host nightly builds. I’ve updated all the Ref Guide jobs to do that and fixed their descriptions to point to the new locations. You can find them at https://nightlies.apache.org/Lucene/.

The Javadocs for both Lucene and Solr also suffer from the same limited CSP, but the Javadocs seem to be able to mostly recover from it. It is possible to push them to the nightlies server for the full JS-enabled experience if we choose.

Infra is also quite open (enthusiastic?) for people to use this server, so if there is any interest in pushing other build artifacts out to it as a regular place to get pre-release builds, we’re welcome to do so. I can help, or you can look at one of the Ref Guide jobs for an example.

Cassandra

On Aug 7, 2020, 12:17 PM -0500, Ishan Chattopadhyaya, wrote:
> Thanks for your work, Uwe. I would love to run a public Jenkins server soon
> (maybe by September), would like to try out your scripts :-)
>
>> On Fri, Aug 7, 2020 at 10:12 PM David Smiley wrote:
>>> Sweet! Thanks Uwe!
>>> ~ David Smiley
>>> Apache Lucene/Solr Search Developer
>>> http://www.linkedin.com/in/davidwsmiley
>>>
>>>> On Thu, Aug 6, 2020 at 5:52 PM Uwe Schindler wrote:
>>>>> Thanks Erick!
>>>>>
>>>>> I hope the remaining issues sort out quite soon.
>>>>>
>>>>> For the release managers: As I did a more scripted, automatic
>>>>> migration using the Jenkins REST API (otherwise the 50 jobs we have
>>>>> would have been a disaster to migrate), I already have a plan to
>>>>> reuse that script to allow the release manager to create clones of
>>>>> all "master" jobs, preconfigured for the release branch. All you need
>>>>> is a Lucene PMC status and a Jenkins API Token, and then you will be
>>>>> able to start a script that creates all release branch jobs in a few
>>>>> seconds 😊
>>>>>
>>>>> Uwe
>>>>>
>>>>> -
>>>>> Uwe Schindler
>>>>> Achterdiek 19, D-28357 Bremen
>>>>> https://www.thetaphi.de
>>>>> eMail: u...@thetaphi.de
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Erick Erickson
>>>>>> Sent: Thursday, August 6, 2020 11:39 PM
>>>>>> To: dev@lucene.apache.org
>>>>>> Subject: Migrating to Cloudbees
>>>>>>
>>>>>> If nobody has expressed their _extreme_ gratitude to Uwe, infra
>>>>>> (and helpers?) for the migration, I hereby rectify that!!
2020-08 Committer virtual meeting
Hello fellow committers,

I'd like to organize another virtual Lucene/Solr committer meeting this month. I created a meeting notes page in Confluence here:
https://cwiki.apache.org/confluence/display/LUCENE/2020-08+Committer+meeting

It has some topics I'd like to talk about, some copied from last month that might be worth following up on, and I'm hoping others might add to the tentative agenda as well. As usual there are many topics to discuss. I suppose if we have these meetings more often, I'll be less compelled to raise seemingly all topics.

When exactly is this? Perhaps next Thursday, or maybe later. I'm using a "Doodle poll" to determine an optimal time slot. For the link to the poll, go to the ASF Slack, #lucene-dev or #solr-dev channel, and you will see it. You could also email me directly for it.

For this virtual committer meeting and future ones:

- This is in the spirit of committer meetings co-located with conferences. ASF policy says that no "decisions" can be made in such a venue. We make decisions on this dev list and indirectly via JIRA, out in the open and with the opportunity for anyone to comment.
- Who: Committer-only or by invitation.
- Video chat with the option of audio dial-in. This time I will use Google Hangout.
- Recorded, for those invited only. I'll dispose of the recording a week after. The intention is for those who cannot be there due to a scheduling conflict to see/hear what was said. I have the ability to do this recording via Salesforce's G-Suite subscription.
- Published notes: I (or someone) will take written meeting notes that are ultimately published for anyone to see (not restricted to those invited). They will be transmitted to the dev list.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley