Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on PR #3151: URL: https://github.com/apache/solr/pull/3151#issuecomment-2721866606 Merged! Thanks, everybody, for the help! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti merged PR #3151: URL: https://github.com/apache/solr/pull/3151
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on PR #3151: URL: https://github.com/apache/solr/pull/3151#issuecomment-2714076350 OK, there have been no updates, comments or help in the last three weeks, so at the end of the week I'll proceed with fixing the tests and merging. Any help with the test cleanup is still welcome!
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on PR #3151: URL: https://github.com/apache/solr/pull/3151#issuecomment-2672286942 I've done another round of cleanup, documentation and improvements. There's still the annoying problem with beforeClass/afterClass and test leftovers. I'll be back in a couple of weeks; feel free to contribute any suggestions! I believe we are quite close to finalising this contribution. Thanks for all the help!
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1964100880 ## solr/modules/llm/src/test/org/apache/solr/llm/textvectorisation/update/processor/TextToVectorUpdateProcessorFactoryTest.java: ## @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.solr.llm.textvectorisation.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.params.MultiMapSolrParams; +import org.apache.solr.common.params.SolrParams; +import org.apache.solr.common.util.NamedList; +import org.apache.solr.llm.TestLlmBase; +import org.apache.solr.request.SolrQueryRequestBase; +import org.junit.AfterClass; +import org.junit.BeforeClass; +import org.junit.Test; + +import java.util.HashMap; +import java.util.Map; + + +public class TextToVectorUpdateProcessorFactoryTest extends TestLlmBase { + private TextToVectorUpdateProcessorFactory factoryToTest = + new TextToVectorUpdateProcessorFactory(); + private NamedList args = new NamedList<>(); + + @BeforeClass + public static void initArgs() throws Exception { +setupTest("solrconfig-llm.xml", "schema.xml", false, false); + } + + @AfterClass + public static void after() throws Exception { +afterTest(); + } + + @Test + public void init_fullArgs_shouldInitAllParams() { +args.add("inputField", "_text_"); +args.add("outputField", "vector"); +args.add("model", "model1"); +factoryToTest.init(args); + +assertEquals("_text_", factoryToTest.getInputField()); +assertEquals("vector", factoryToTest.getOutputField()); +assertEquals("model1", factoryToTest.getModelName()); + } + + @Test + public void init_nullInputField_shouldThrowExceptionWithDetailedMessage() { +args.add("outputField", "vector"); +args.add("model", "model1"); + +SolrException e = assertThrows(SolrException.class, () -> factoryToTest.init(args)); +assertEquals("Missing required parameter: inputField", e.getMessage()); + } + + @Test + public void init_nullOutputField_shouldThrowExceptionWithDetailedMessage() { +args.add("inputField", "_text_"); +args.add("model", "model1"); + +SolrException e = assertThrows(SolrException.class, () -> factoryToTest.init(args)); +assertEquals("Missing required parameter: outputField", e.getMessage()); + } + + @Test + public void 
init_nullModel_shouldThrowExceptionWithDetailedMessage() { +args.add("inputField", "_text_"); +args.add("outputField", "vector"); + +SolrException e = assertThrows(SolrException.class, () -> factoryToTest.init(args)); +assertEquals("Missing required parameter: model", e.getMessage()); + } + + /* Following tests depends on a real solr schema */ + @Test + public void init_notExistentOutputField_shouldThrowExceptionWithDetailedMessage() throws Exception { +args.add("inputField", "_text_"); +args.add("outputField", "notExistentOutput"); +args.add("model", "model1"); + +Map params = new HashMap<>(); +MultiMapSolrParams mmparams = new MultiMapSolrParams(params); Review Comment: Thanks! Just pushed it!
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1964094479 ## solr/modules/llm/src/test/org/apache/solr/llm/textvectorisation/update/processor/TextToVectorUpdateProcessorFactoryTest.java: ## @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.solr.llm.textvectorisation.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.params.MultiMapSolrParams; +import org.apache.solr.common.params.SolrParams; +import org.apache.solr.common.util.NamedList; +import org.apache.solr.llm.TestLlmBase; +import org.apache.solr.request.SolrQueryRequestBase; +import org.junit.AfterClass; +import org.junit.BeforeClass; +import org.junit.Test; + +import java.util.HashMap; +import java.util.Map; + + +public class TextToVectorUpdateProcessorFactoryTest extends TestLlmBase { + private TextToVectorUpdateProcessorFactory factoryToTest = + new TextToVectorUpdateProcessorFactory(); + private NamedList args = new NamedList<>(); Review Comment: Agreed and pushed the changes!
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1964064479 ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java: ## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.SolrInputField; +import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel; +import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.update.AddUpdateCommand; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.util.ArrayList; +import java.util.List; + + +class TextToVectorUpdateProcessor extends UpdateRequestProcessor { +private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + +private final String inputField; +private final String outputField; +private final String model; +private SolrTextToVectorModel textToVector; +private ManagedTextToVectorModelStore modelStore = null; + +public TextToVectorUpdateProcessor( +String inputField, +String outputField, +String model, +SolrQueryRequest req, +UpdateRequestProcessor next) { +super(next); +this.inputField = inputField; +this.outputField = outputField; +this.model = model; +this.modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore()); +} + +/** + * @param cmd the update command in input containing the Document to process + * @throws IOException If there is a low-level I/O error + */ +@Override +public void processAdd(AddUpdateCommand cmd) throws IOException { +this.textToVector = modelStore.getModel(model); +if (textToVector == null) { +throw new SolrException( +SolrException.ErrorCode.BAD_REQUEST, +"The model requested '" ++ model ++ "' can't be found in the store: " ++ ManagedTextToVectorModelStore.REST_END_POINT); +} + +SolrInputDocument doc = cmd.getSolrInputDocument(); +SolrInputField 
inputFieldContent = doc.get(inputField); +if (!isNullOrEmpty(inputFieldContent, doc, inputField)) { +String textToVectorise = inputFieldContent.getValue().toString();//add null checks and +float[] vector = textToVector.vectorise(textToVectorise); +List vectorAsList = new ArrayList(vector.length); +for (float f : vector) { +vectorAsList.add(f); +} +doc.addField(outputField, vectorAsList); +} +super.processAdd(cmd); +} + +protected boolean isNullOrEmpty(SolrInputField inputFieldContent, SolrInputDocument doc, String fieldName) { Review Comment: Addressed; resolving this. Feel free to open a new comment if necessary.
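For readers following the quoted `processAdd` above, here is a minimal, self-contained sketch of the two small steps it performs: skipping null or empty input, and boxing the `float[]` returned by the model into a list before adding it to the document. The class name `VectorBoxingSketch` and the simplified `isNullOrEmpty(Object)` helper are hypothetical and are not the code in the PR (the real helper takes a `SolrInputField`, the document and the field name).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of the Solr llm module. */
public class VectorBoxingSketch {

  /** Simplified stand-in for the processor's null/empty check on the input value. */
  public static boolean isNullOrEmpty(Object value) {
    return value == null || value.toString().isEmpty();
  }

  /** Boxes a primitive vector into a List<Float>, preserving order. */
  public static List<Float> toList(float[] vector) {
    List<Float> boxed = new ArrayList<>(vector.length);
    for (float f : vector) {
      boxed.add(f);
    }
    return boxed;
  }

  public static void main(String[] args) {
    String text = "Vegeta is the saiyan prince.";
    if (!isNullOrEmpty(text)) {
      // A real model would compute the vector; a fixed one is assumed here.
      System.out.println(toList(new float[] {1.0f, 2.0f, 3.0f, 4.0f}));
    }
  }
}
```

The empty-input branch simply skips vectorisation, which matches the intent of the test named `processAdd_emptyInputField_shouldLogAndIndexWithNoVector` quoted elsewhere in this thread.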
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1964083376 ## solr/modules/llm/src/test/org/apache/solr/llm/textvectorisation/search/TextToVectorQParserTest.java: ## @@ -29,6 +30,11 @@ public static void init() throws Exception { loadModel("dummy-model.json"); } + @AfterClass + public static void cleanup() throws Exception { +afterTest(); + } Review Comment: That's the part that is giving me a headache and nasty leftovers in tests (as detailed in the initial comment). The reason for it was to index once before all the tests, run the tests, and then do the cleanup at the end.
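The lifecycle described in this comment (set up once, run every test, clean up once) can be sketched in plain Java without JUnit. This is only an illustration of the intended ordering; `SuiteLifecycleSketch` and `runSuite` are hypothetical names, and the event strings are placeholders rather than anything the Solr test framework produces.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of suite-level setup and cleanup ordering. */
public class SuiteLifecycleSketch {

  /** Records the order of events: one setup, each test, one cleanup. */
  public static List<String> runSuite(List<String> testNames) {
    List<String> events = new ArrayList<>();
    events.add("beforeClass: index documents");
    try {
      for (String test : testNames) {
        events.add("test: " + test);
      }
    } finally {
      // Cleanup runs exactly once, even if a test throws.
      events.add("afterClass: cleanup");
    }
    return events;
  }

  public static void main(String[] args) {
    runSuite(List.of("first", "second")).forEach(System.out::println);
  }
}
```

In JUnit 4 the same ordering comes from static methods annotated with `@BeforeClass` and `@AfterClass`, which is what the quoted diff relies on.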
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1964081310 ## solr/modules/llm/src/java/org/apache/solr/llm/textvectorisation/update/processor/TextToVectorUpdateProcessorFactory.java: ## @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.llm.textvectorisation.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.params.RequiredSolrParams; +import org.apache.solr.common.params.SolrParams; +import org.apache.solr.common.util.NamedList; +import org.apache.solr.llm.textvectorisation.model.SolrTextToVectorModel; +import org.apache.solr.llm.textvectorisation.store.rest.ManagedTextToVectorModelStore; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.schema.DenseVectorField; +import org.apache.solr.schema.FieldType; +import org.apache.solr.schema.IndexSchema; +import org.apache.solr.schema.SchemaField; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.apache.solr.update.processor.UpdateRequestProcessorFactory; + +/** + * This class implements an UpdateProcessorFactory for the Text To Vector Update Processor. + */ +public class TextToVectorUpdateProcessorFactory extends UpdateRequestProcessorFactory { +private static final String INPUT_FIELD_PARAM = "inputField"; +private static final String OUTPUT_FIELD_PARAM = "outputField"; +private static final String MODEL_NAME = "model"; + +private String inputField; +private String outputField; +private String modelName; +private SolrParams params; + + +@Override +public void init(final NamedList args) { +params = args.toSolrParams(); +RequiredSolrParams required = params.required(); +inputField = required.get(INPUT_FIELD_PARAM); +outputField = required.get(OUTPUT_FIELD_PARAM); +modelName = required.get(MODEL_NAME); +} + +@Override +public UpdateRequestProcessor getInstance(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) { +IndexSchema latestSchema = req.getCore().getLatestSchema(); +if(!latestSchema.hasExplicitField(inputField)){ +throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "undefined field: \"" + inputField + "\""); +} 
+if(!latestSchema.hasExplicitField(outputField)){ +throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "undefined field: \"" + outputField + "\""); +} + +final SchemaField outputFieldSchema = latestSchema.getField(outputField); +assertIsDenseVectorField(outputFieldSchema); + +ManagedTextToVectorModelStore modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore()); +SolrTextToVectorModel textToVector = modelStore.getModel(modelName); Review Comment: I have the same assumption. I debugged it a few times, and the managed resource keeps track of models in a map, so it should work as expected.
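To illustrate why a map-backed store makes repeated lookups cheap, here is a minimal sketch. It is not the `ManagedTextToVectorModelStore` implementation: `ModelStoreSketch` is a hypothetical class, the choice of `ConcurrentHashMap` is an assumption, and a `float[]` stands in for the model object.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a managed model store; not the Solr class. */
public class ModelStoreSketch {

  // Models are registered once (for example on upload) and kept in memory by name.
  private final Map<String, float[]> modelsByName = new ConcurrentHashMap<>();

  public void addModel(String name, float[] dummyModel) {
    modelsByName.put(name, dummyModel);
  }

  /** Hash lookup by name; returns null when the model was never registered. */
  public float[] getModel(String name) {
    return modelsByName.get(name);
  }

  public static void main(String[] args) {
    ModelStoreSketch store = new ModelStoreSketch();
    store.addModel("dummy-1", new float[] {1.0f, 2.0f, 3.0f, 4.0f});
    System.out.println(store.getModel("dummy-1") != null);
    System.out.println(store.getModel("missing") == null);
  }
}
```

A `null` result here corresponds to the case in which the quoted processor throws a `BAD_REQUEST` because the requested model cannot be found in the store.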
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
dsmiley commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r194470 ## solr/modules/llm/src/java/org/apache/solr/llm/textvectorisation/update/processor/TextToVectorUpdateProcessorFactory.java: ## @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.llm.textvectorisation.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.params.RequiredSolrParams; +import org.apache.solr.common.params.SolrParams; +import org.apache.solr.common.util.NamedList; +import org.apache.solr.llm.textvectorisation.model.SolrTextToVectorModel; +import org.apache.solr.llm.textvectorisation.store.rest.ManagedTextToVectorModelStore; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.schema.DenseVectorField; +import org.apache.solr.schema.FieldType; +import org.apache.solr.schema.IndexSchema; +import org.apache.solr.schema.SchemaField; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.apache.solr.update.processor.UpdateRequestProcessorFactory; + +/** + * This class implements an UpdateProcessorFactory for the Text To Vector Update Processor. + */ +public class TextToVectorUpdateProcessorFactory extends UpdateRequestProcessorFactory { +private static final String INPUT_FIELD_PARAM = "inputField"; +private static final String OUTPUT_FIELD_PARAM = "outputField"; +private static final String MODEL_NAME = "model"; + +private String inputField; +private String outputField; +private String modelName; +private SolrParams params; + + +@Override +public void init(final NamedList args) { +params = args.toSolrParams(); +RequiredSolrParams required = params.required(); +inputField = required.get(INPUT_FIELD_PARAM); +outputField = required.get(OUTPUT_FIELD_PARAM); +modelName = required.get(MODEL_NAME); +} + +@Override +public UpdateRequestProcessor getInstance(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) { +IndexSchema latestSchema = req.getCore().getLatestSchema(); +if(!latestSchema.hasExplicitField(inputField)){ +throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "undefined field: \"" + inputField + "\""); +} 
+if(!latestSchema.hasExplicitField(outputField)){ +throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "undefined field: \"" + outputField + "\""); +} + +final SchemaField outputFieldSchema = latestSchema.getField(outputField); +assertIsDenseVectorField(outputFieldSchema); + +ManagedTextToVectorModelStore modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore()); +SolrTextToVectorModel textToVector = modelStore.getModel(modelName); Review Comment: I presume looking up the model is fast/cached if it already exists? ## solr/modules/llm/src/test/org/apache/solr/llm/textvectorisation/search/TextToVectorQParserTest.java: ## @@ -29,6 +30,11 @@ public static void init() throws Exception { loadModel("dummy-model.json"); } + @AfterClass + public static void cleanup() throws Exception { +afterTest(); + } Review Comment: weird; why? Weird to see a test cleanup also be a suite cleanup. (same for TestModelManager) ## solr/modules/llm/src/test/org/apache/solr/llm/textvectorisation/update/processor/TextToVectorUpdateProcessorFactoryTest.java: ## @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed unde
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on PR #3151: URL: https://github.com/apache/solr/pull/3151#issuecomment-2639773032 I added one iteration of polishing, which should have addressed @cpoerschke's concerns about vectorisation failures (I took inspiration from the lang detect update processor). I also removed the additional test Solr config files, addressing @dsmiley's concerns. I'm still puzzled by the testing errors I get (the before/after problems I mentioned in the first comment); any help there would be beneficial. I think we made progress. I'll wait for a few more iterations and then work on the documentation.
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943637205 ## solr/modules/llm/src/test/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorTest.java: ## @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.client.solrj.SolrQuery; +import org.apache.solr.llm.TestLlmBase; +import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore; +import org.junit.BeforeClass; +import org.junit.Test; + + +public class TextToVectorUpdateProcessorTest extends TestLlmBase { + +@BeforeClass +public static void init() throws Exception { +setupTest("solrconfig-llm-indexing.xml", "schema.xml", false, false); + +} + +@Test +public void processAdd_inputField_shouldVectoriseInputField() +throws Exception { +loadModel("dummy-model.json"); +assertU(adoc("id", "99", "_text_", "Vegeta is the saiyan prince.")); +assertU(adoc("id", "98", "_text_", "Vegeta is the saiyan prince.")); +assertU(commit()); + +final String solrQuery = "*:*"; +final SolrQuery query = new SolrQuery(); +query.setQuery(solrQuery); +query.add("fl", "id,vector"); + +assertJQ( +"/query" + query.toQueryString(), +"/response/numFound==2]", +"/response/docs/[0]/id=='99'", +"/response/docs/[0]/vector==[1.0, 2.0, 3.0, 4.0]", +"/response/docs/[1]/id=='98'", +"/response/docs/[1]/vector==[1.0, 2.0, 3.0, 4.0]"); + +restTestHarness.delete(ManagedTextToVectorModelStore.REST_END_POINT + "/dummy-1"); +} + +/* +This test looks for the 'dummy-1' model, but such model is not loaded, the model store is empty, so the update fails + */ +@Test +public void processAdd_modelNotFound_shouldRaiseException() { +assertFailedU("This update should fail but actually succeeded", adoc("id", "99", "_text_", "Vegeta is the saiyan prince.")); + +checkUpdateU(adoc("id", "99", "_text_", "Vegeta is the saiyan prince."), +"/response/lst[@name='error']/str[@name='msg']=\"The model requested 'dummy-1' can't be found in the store: /schema/text-to-vector-model-store\"", +"/response/lst[@name='error']/int[@name='code']='400'"); +} + +@Test +public void processAdd_emptyInputField_shouldLogAndIndexWithNoVector() throws Exception { +loadModel("dummy-model.json"); 
+assertU(adoc("id", "99", "_text_", "")); +assertU(adoc("id", "98", "_text_", "Vegeta is the saiyan prince.")); +assertU(commit()); + +final String solrQuery = "*:*"; +final SolrQuery query = new SolrQuery(); +query.setQuery(solrQuery); +query.add("fl", "id,vector"); + +assertJQ( +"/query" + query.toQueryString(), +"/response/numFound==2]", +"/response/docs/[0]/id=='99'", +"!/response/docs/[0]/vector==", //no vector field for the document 99 Review Comment: It took almost an afternoon to find that, so it deserved a comment :)
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943636345 ## solr/modules/llm/src/test/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorTest.java: ## @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.client.solrj.SolrQuery; +import org.apache.solr.llm.TestLlmBase; +import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore; +import org.junit.BeforeClass; +import org.junit.Test; + + +public class TextToVectorUpdateProcessorTest extends TestLlmBase { + +@BeforeClass +public static void init() throws Exception { +setupTest("solrconfig-llm-indexing.xml", "schema.xml", false, false); + +} + +@Test +public void processAdd_inputField_shouldVectoriseInputField() +throws Exception { +loadModel("dummy-model.json"); +assertU(adoc("id", "99", "_text_", "Vegeta is the saiyan prince.")); +assertU(adoc("id", "98", "_text_", "Vegeta is the saiyan prince.")); +assertU(commit()); + +final String solrQuery = "*:*"; +final SolrQuery query = new SolrQuery(); +query.setQuery(solrQuery); +query.add("fl", "id,vector"); + +assertJQ( +"/query" + query.toQueryString(), +"/response/numFound==2]", +"/response/docs/[0]/id=='99'", +"/response/docs/[0]/vector==[1.0, 2.0, 3.0, 4.0]", +"/response/docs/[1]/id=='98'", +"/response/docs/[1]/vector==[1.0, 2.0, 3.0, 4.0]"); + +restTestHarness.delete(ManagedTextToVectorModelStore.REST_END_POINT + "/dummy-1"); Review Comment: It was cleanup, but not all tests need it, so I added it explicitly and added a line comment to make that clearer.
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943633074 ## solr/modules/llm/src/test/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactoryTest.java: ## @@ -0,0 +1,129 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.params.MultiMapSolrParams; +import org.apache.solr.common.params.SolrParams; +import org.apache.solr.common.util.NamedList; +import org.apache.solr.llm.TestLlmBase; +import org.apache.solr.request.SolrQueryRequestBase; +import org.junit.AfterClass; +import org.junit.BeforeClass; +import org.junit.Test; + +import java.util.HashMap; +import java.util.Map; + + +public class TextToVectorUpdateProcessorFactoryTest extends TestLlmBase { + private TextToVectorUpdateProcessorFactory factoryToTest = + new TextToVectorUpdateProcessorFactory(); + private NamedList args = new NamedList<>(); + + @BeforeClass + public static void initArgs() throws Exception { +setupTest("solrconfig-llm.xml", "schema.xml", false, false); + } + + @AfterClass + public static void after() throws Exception { +afterTest(); + } + + @Test + public void init_fullArgs_shouldInitFullClassificationParams() { +args.add("inputField", "_text_"); +args.add("outputField", "vector"); +args.add("model", "model1"); +factoryToTest.init(args); + +assertEquals("_text_", factoryToTest.getInputField()); +assertEquals("vector", factoryToTest.getOutputField()); +assertEquals("model1", factoryToTest.getModelName()); + } + + @Test + public void init_nullInputField_shouldThrowExceptionWithDetailedMessage() { +args.add("outputField", "vector"); +args.add("model", "model1"); + +SolrException e = assertThrows(SolrException.class, () -> factoryToTest.init(args)); +assertEquals("Text to Vector UpdateProcessor 'inputField' can not be null", e.getMessage()); + } + + @Test + public void init_notExistentInputField_shouldThrowExceptionWithDetailedMessage() throws Exception { +args.add("inputField", "notExistentInput"); +args.add("outputField", "vector"); +args.add("model", "model1"); + +Map params = new HashMap<>(); +MultiMapSolrParams mmparams = new MultiMapSolrParams(params); 
+SolrQueryRequestBase req = new SolrQueryRequestBase(solrClientTestRule.getCoreContainer().getCore("collection1"), (SolrParams) mmparams) {}; Review Comment: I admit I don't know, I'm not that Java-savvy. I suspect it has to do with instantiating a subclass of an abstract class?
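For context on the trailing `{}` in the quoted line: a minimal sketch, assuming the base class is declared abstract, as the comment suspects. In Java, `new AbstractType(args) {}` declares and instantiates an anonymous subclass with an empty body, which is how an abstract class with no abstract methods left to implement can be instantiated inline. `RequestBase` and its members are hypothetical stand-ins, not Solr types.

```java
import java.util.Map;

public class AnonymousSubclassDemo {

  // Hypothetical stand-in for an abstract request base class (not a Solr type).
  public abstract static class RequestBase {
    private final String coreName;
    private final Map<String, String[]> params;

    protected RequestBase(String coreName, Map<String, String[]> params) {
      this.coreName = coreName;
      this.params = params;
    }

    public String getCoreName() {
      return coreName;
    }

    public int paramCount() {
      return params.size();
    }
  }

  // The trailing {} creates an anonymous subclass with an empty body.
  // Without the braces, `new RequestBase(...)` would not compile,
  // because an abstract class cannot be instantiated directly.
  public static RequestBase newRequest(String coreName, Map<String, String[]> params) {
    return new RequestBase(coreName, params) {};
  }

  public static void main(String[] args) {
    RequestBase req = newRequest("collection1", Map.of());
    System.out.println(req.getCoreName());
    // The runtime class is an anonymous subclass, not RequestBase itself.
    System.out.println(req.getClass().isAnonymousClass());
  }
}
```

The same pattern also allows overriding individual methods inside the braces when a test needs slightly different behaviour from the base class.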
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943628396 ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java: ## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.SolrInputField; +import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel; +import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.update.AddUpdateCommand; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.util.ArrayList; +import java.util.List; + + +class TextToVectorUpdateProcessor extends UpdateRequestProcessor { +private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + +private final String inputField; +private final String outputField; +private final String model; +private SolrTextToVectorModel textToVector; +private ManagedTextToVectorModelStore modelStore = null; + +public TextToVectorUpdateProcessor( +String inputField, +String outputField, +String model, +SolrQueryRequest req, +UpdateRequestProcessor next) { +super(next); +this.inputField = inputField; +this.outputField = outputField; +this.model = model; +this.modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore()); +} + +/** + * @param cmd the update command in input containing the Document to process + * @throws IOException If there is a low-level I/O error + */ +@Override +public void processAdd(AddUpdateCommand cmd) throws IOException { +this.textToVector = modelStore.getModel(model); +if (textToVector == null) { +throw new SolrException( +SolrException.ErrorCode.BAD_REQUEST, +"The model requested '" ++ model ++ "' can't be found in the store: " ++ ManagedTextToVectorModelStore.REST_END_POINT); +} + +SolrInputDocument doc = cmd.getSolrInputDocument(); +SolrInputField 
inputFieldContent = doc.get(inputField); +if (!isNullOrEmpty(inputFieldContent, doc, inputField)) { +String textToVectorise = inputFieldContent.getValue().toString();//add null checks and +float[] vector = textToVector.vectorise(textToVectorise); +List vectorAsList = new ArrayList(vector.length); +for (float f : vector) { +vectorAsList.add(f); +} +doc.addField(outputField, vectorAsList); +} +super.processAdd(cmd); +} + +protected boolean isNullOrEmpty(SolrInputField inputFieldContent, SolrInputDocument doc, String fieldName) { Review Comment: Mmm, I see your point. Would it be better to just log a warning saying "vectorisation failed", with the reason "null or empty source field"? I suspect a silent failure would be just as problematic, since it would be hard to understand why there are no vectors. (I also just discovered that the 'vectorise' method can throw a runtime exception.)
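A rough sketch of the option discussed in this comment, not the PR's code: skip vectorisation with a logged warning when the source field is null or empty, and report a runtime failure from the model as a warning instead of failing silently. `TextVectoriser`, `vectoriseOrWarn` and the use of `java.util.logging` are illustrative assumptions; the actual processor works with SLF4J and Solr types, and whether a model failure should be logged or rethrown is still open in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.logging.Logger;

public class VectorisationGuardDemo {
  private static final Logger log = Logger.getLogger(VectorisationGuardDemo.class.getName());

  // Hypothetical stand-in for the model call, which may throw at runtime.
  public interface TextVectoriser {
    float[] vectorise(String text);
  }

  // Returns the vector as a list, or empty (with a logged reason) when
  // the source value is missing or the model call fails.
  public static Optional<List<Float>> vectoriseOrWarn(
      TextVectoriser model, Object fieldValue, String fieldName) {
    if (fieldValue == null || fieldValue.toString().isEmpty()) {
      log.warning("Vectorisation failed: null or empty source field '" + fieldName + "'");
      return Optional.empty();
    }
    try {
      float[] vector = model.vectorise(fieldValue.toString());
      List<Float> vectorAsList = new ArrayList<>(vector.length);
      for (float f : vector) {
        vectorAsList.add(f);
      }
      return Optional.of(vectorAsList);
    } catch (RuntimeException e) {
      log.warning("Vectorisation failed for field '" + fieldName + "': " + e.getMessage());
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    TextVectoriser dummy = text -> new float[] {1.0f, 2.0f, 3.0f, 4.0f};
    System.out.println(vectoriseOrWarn(dummy, "Vegeta is the saiyan prince.", "_text_"));
    System.out.println(vectoriseOrWarn(dummy, "", "_text_"));
  }
}
```

Returning an `Optional` keeps the "no vector" case explicit for the caller, which can then decide whether to index the document without the output field.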
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943616272 ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java: ## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.llm.texttovector.update.processor; Review Comment: I agree, 'texttovector' is horribly unreadable; maybe 'textvectorisation'? I'll add it in the coming commit.
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943614530 ## solr/modules/llm/src/test-files/solr/collection1/conf/solrconfig-llm-indexing-notDenseVectorField.xml: ## Review Comment: I could, but I added it because I struggled to find test methods, such as org.apache.solr.util.RestTestBase#assertU(java.lang.String), that take the update chain as a parameter. So I configured the chain as the default, which let me test it. I would not want it to be the default when indexing docs for the query-time tests. If you have any suggestions, I'm open to changes.
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943608528 ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactory.java: ## @@ -0,0 +1,97 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.params.SolrParams; +import org.apache.solr.common.util.NamedList; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.schema.DenseVectorField; +import org.apache.solr.schema.FieldType; +import org.apache.solr.schema.SchemaField; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.apache.solr.update.processor.UpdateRequestProcessorFactory; + +/** + * This class implements an UpdateProcessorFactory for the Text To Vector Update Processor. 
+ */ +public class TextToVectorUpdateProcessorFactory extends UpdateRequestProcessorFactory { +private static final String INPUT_FIELD_PARAM = "inputField"; +private static final String OUTPUT_FIELD_PARAM = "outputField"; +private static final String MODEL_NAME = "model"; + +String inputField; +String outputField; +String modelName; +SolrParams params; + + +@Override +public void init(final NamedList args) { +if (args != null) { +params = args.toSolrParams(); +inputField = params.get(INPUT_FIELD_PARAM); +checkNotNull(INPUT_FIELD_PARAM, inputField); + +outputField = params.get(OUTPUT_FIELD_PARAM); +checkNotNull(OUTPUT_FIELD_PARAM, outputField); + +modelName = params.get(MODEL_NAME); +checkNotNull(MODEL_NAME, modelName); +} +} + +private void checkNotNull(String paramName, Object param) { +if (param == null) { +throw new SolrException( +SolrException.ErrorCode.SERVER_ERROR, +"Text to Vector UpdateProcessor '" + paramName + "' can not be null"); +} +} + +@Override +public UpdateRequestProcessor getInstance(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) { +req.getCore().getLatestSchema().getField(inputField); Review Comment: It checks that 'inputField' is defined in the schema. In the latest commit I changed it to make that more explicit, but I am open to suggestions.
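One way such a check could be made explicit, sketched with a plain map standing in for the schema. `requireField`, `requireDenseVector` and the map-based schema are hypothetical; in Solr the lookup goes through the core's latest schema, and as far as I can tell `getField` already fails for an undefined field, which is what the bare call in the quoted line relies on.

```java
import java.util.Map;

public class SchemaFieldCheckDemo {

  // Hypothetical lookup: field name -> field type name, standing in for a schema.
  public static String requireField(Map<String, String> schema, String fieldName) {
    String type = schema.get(fieldName);
    if (type == null) {
      throw new IllegalArgumentException("undefined field: \"" + fieldName + "\"");
    }
    return type;
  }

  // Checks that the output field exists and is declared as a dense vector field.
  public static void requireDenseVector(Map<String, String> schema, String fieldName) {
    String type = requireField(schema, fieldName);
    if (!"DenseVectorField".equals(type)) {
      throw new IllegalArgumentException(
          "field \"" + fieldName + "\" must be a DenseVectorField but is " + type);
    }
  }

  public static void main(String[] args) {
    Map<String, String> schema = Map.of("_text_", "TextField", "vector", "DenseVectorField");
    requireField(schema, "_text_"); // input field is defined
    requireDenseVector(schema, "vector"); // output field has the expected type
    try {
      requireField(schema, "notExistentInput");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Naming the check after what it asserts makes the intent visible at the call site, where a bare lookup whose result is discarded reads like dead code.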
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943595213 ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactory.java: ## @@ -0,0 +1,97 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.params.SolrParams; +import org.apache.solr.common.util.NamedList; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.schema.DenseVectorField; +import org.apache.solr.schema.FieldType; +import org.apache.solr.schema.SchemaField; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.apache.solr.update.processor.UpdateRequestProcessorFactory; + +/** + * This class implements an UpdateProcessorFactory for the Text To Vector Update Processor. 
+ */ +public class TextToVectorUpdateProcessorFactory extends UpdateRequestProcessorFactory { +private static final String INPUT_FIELD_PARAM = "inputField"; +private static final String OUTPUT_FIELD_PARAM = "outputField"; +private static final String MODEL_NAME = "model"; + +String inputField; +String outputField; +String modelName; +SolrParams params; + + +@Override +public void init(final NamedList args) { +if (args != null) { +params = args.toSolrParams(); +inputField = params.get(INPUT_FIELD_PARAM); +checkNotNull(INPUT_FIELD_PARAM, inputField); + +outputField = params.get(OUTPUT_FIELD_PARAM); +checkNotNull(OUTPUT_FIELD_PARAM, outputField); + +modelName = params.get(MODEL_NAME); +checkNotNull(MODEL_NAME, modelName); Review Comment: much cleaner, thanks David!
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943577028 ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactory.java: ## @@ -0,0 +1,97 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.params.SolrParams; +import org.apache.solr.common.util.NamedList; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.schema.DenseVectorField; +import org.apache.solr.schema.FieldType; +import org.apache.solr.schema.SchemaField; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.apache.solr.update.processor.UpdateRequestProcessorFactory; + +/** + * This class implements an UpdateProcessorFactory for the Text To Vector Update Processor. 
+ */ +public class TextToVectorUpdateProcessorFactory extends UpdateRequestProcessorFactory { +private static final String INPUT_FIELD_PARAM = "inputField"; +private static final String OUTPUT_FIELD_PARAM = "outputField"; +private static final String MODEL_NAME = "model"; + +String inputField; +String outputField; +String modelName; +SolrParams params; + + +@Override +public void init(final NamedList args) { +if (args != null) { Review Comment: my bad, I took inspiration from an old factory, I'll remove this useless check in the next commit!
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943565203 ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java: ## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.SolrInputField; +import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel; +import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.update.AddUpdateCommand; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.util.ArrayList; +import java.util.List; + + +class TextToVectorUpdateProcessor extends UpdateRequestProcessor { +private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + +private final String inputField; +private final String outputField; +private final String model; +private SolrTextToVectorModel textToVector; +private ManagedTextToVectorModelStore modelStore = null; + +public TextToVectorUpdateProcessor( +String inputField, +String outputField, +String model, +SolrQueryRequest req, +UpdateRequestProcessor next) { +super(next); +this.inputField = inputField; +this.outputField = outputField; +this.model = model; +this.modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore()); +} + +/** + * @param cmd the update command in input containing the Document to process + * @throws IOException If there is a low-level I/O error + */ +@Override +public void processAdd(AddUpdateCommand cmd) throws IOException { +this.textToVector = modelStore.getModel(model); +if (textToVector == null) { +throw new SolrException( +SolrException.ErrorCode.BAD_REQUEST, +"The model requested '" ++ model ++ "' can't be found in the store: " ++ ManagedTextToVectorModelStore.REST_END_POINT); +} + +SolrInputDocument doc = cmd.getSolrInputDocument(); +SolrInputField 
inputFieldContent = doc.get(inputField); +if (!isNullOrEmpty(inputFieldContent, doc, inputField)) { +String textToVectorise = inputFieldContent.getValue().toString();//add null checks and +float[] vector = textToVector.vectorise(textToVectorise); Review Comment: 1) @cpoerschke: I double-checked, and the langchain4j 'embed' method (used in our 'vectorise' method) can throw a RuntimeException. That's bad, as it can't be detected without investigating the internals of the code (I hate these practices). I'll give it some thought; any suggestion is welcome! 2) @epugh: given that 'update.chain' is a request parameter, if you configure one chain without vector enrichment and one with it, what prevents you from first indexing with the 'no vectors' chain and then slowly updating the index with atomic updates that add vectors (using the vector chain)? We should double-check this and add it to the documentation once we consolidate the code. What do you think?
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on PR #3151: URL: https://github.com/apache/solr/pull/3151#issuecomment-2637867724
> I wanted to follow-up on my feedback to the LLM module concerning use of the word "embedding". I first tried to say that I was not familiar with the word, and your response was to remove it (completely?) from the module. If "embedding" is an appropriate word then use it. The documentation should reference it in the ref guide, even if just an "AKA".
"Embedding" is widely used in the field, but it's a bit ambiguous and, to be honest, I'm with you in not using any term that can cause confusion. Do you mean I added "embedding" in here somewhere? If that's the case, it's a mistake; point me to it and I'll remove it!
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943565203 ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java: ## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.SolrInputField; +import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel; +import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.update.AddUpdateCommand; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.util.ArrayList; +import java.util.List; + + +class TextToVectorUpdateProcessor extends UpdateRequestProcessor { +private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + +private final String inputField; +private final String outputField; +private final String model; +private SolrTextToVectorModel textToVector; +private ManagedTextToVectorModelStore modelStore = null; + +public TextToVectorUpdateProcessor( +String inputField, +String outputField, +String model, +SolrQueryRequest req, +UpdateRequestProcessor next) { +super(next); +this.inputField = inputField; +this.outputField = outputField; +this.model = model; +this.modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore()); +} + +/** + * @param cmd the update command in input containing the Document to process + * @throws IOException If there is a low-level I/O error + */ +@Override +public void processAdd(AddUpdateCommand cmd) throws IOException { +this.textToVector = modelStore.getModel(model); +if (textToVector == null) { +throw new SolrException( +SolrException.ErrorCode.BAD_REQUEST, +"The model requested '" ++ model ++ "' can't be found in the store: " ++ ManagedTextToVectorModelStore.REST_END_POINT); +} + +SolrInputDocument doc = cmd.getSolrInputDocument(); +SolrInputField 
inputFieldContent = doc.get(inputField); +if (!isNullOrEmpty(inputFieldContent, doc, inputField)) { +String textToVectorise = inputFieldContent.getValue().toString();//add null checks and +float[] vector = textToVector.vectorise(textToVectorise); Review Comment: 1) @cpoerschke: I double-checked, and the langchain4j library's 'embed' method (which our 'vectorise' method uses) doesn't appear to throw any exception, but I agree we should investigate what happens if that request fails (my best guess is that we get an empty vector or null; I'll add that to the tests). 2) @epugh: given that 'update.chain' is a parameter, if you configure one chain without vector enrichment and one with it, what prevents you from first indexing with the 'no vectors' chain and then slowly updating the index with atomic updates that add vectors (using the vector chain)? We should double-check this and add it to the documentation once we consolidate the code. What do you think?
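The "empty vector or null" guess in point 1 could be covered by a guard along these lines. This is only a minimal sketch using plain Java maps as stand-ins for Solr documents; `VectorGuardSketch`, `isUsable` and `withVector` are hypothetical names, not part of the PR or of langchain4j.

```java
import java.util.HashMap;
import java.util.Map;

final class VectorGuardSketch {
  /** True only when the model produced a non-null, non-empty vector. */
  static boolean isUsable(float[] vector) {
    return vector != null && vector.length > 0;
  }

  /** Copies the document and sets outputField only when the vector is usable. */
  static Map<String, Object> withVector(
      Map<String, Object> doc, String outputField, float[] vector) {
    Map<String, Object> copy = new HashMap<>(doc);
    if (isUsable(vector)) {
      copy.put(outputField, vector);
    }
    return copy;
  }
}
```

A test for the real processor would presumably assert the same two cases (null and zero-length) against whatever the model returns on a failed request.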
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943553277 ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java: ## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.SolrInputField; +import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel; +import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.update.AddUpdateCommand; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.util.ArrayList; +import java.util.List; + + +class TextToVectorUpdateProcessor extends UpdateRequestProcessor { +private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + +private final String inputField; +private final String outputField; +private final String model; +private SolrTextToVectorModel textToVector; +private ManagedTextToVectorModelStore modelStore = null; + +public TextToVectorUpdateProcessor( +String inputField, +String outputField, +String model, +SolrQueryRequest req, +UpdateRequestProcessor next) { +super(next); +this.inputField = inputField; +this.outputField = outputField; +this.model = model; +this.modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore()); +} + +/** + * @param cmd the update command in input containing the Document to process + * @throws IOException If there is a low-level I/O error + */ +@Override +public void processAdd(AddUpdateCommand cmd) throws IOException { +this.textToVector = modelStore.getModel(model); Review Comment: I was debugging the flow to have a better understanding of the lifecycle of an update request processor. From what I see from the test, the factory instantiates a new update request processor every time a new update request is received. 
I think it's OK to keep it as a class member, but let me see if I can move the instantiation to the factory. Ideally, I wanted that to happen when the factory is initialised, but it seems that the update request processor factory is not compatible with resource loading (as far as I have debugged and checked).
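The lifecycle described above can be sketched with plain-Java stand-ins (no Solr classes): the factory is created once, `getInstance` runs for every update request, and the model is resolved there and handed to the processor as a final field. `Model`, `ModelStore`, `SketchProcessor` and `SketchFactory` are hypothetical names chosen for illustration, not the PR's code.

```java
import java.util.Map;

/** Hypothetical stand-in for a text-to-vector model. */
interface Model {
  float[] vectorise(String text);
}

/** Hypothetical stand-in for the managed model store. */
final class ModelStore {
  private final Map<String, Model> models;

  ModelStore(Map<String, Model> models) {
    this.models = models;
  }

  Model getModel(String name) {
    return models.get(name); // null when the model is unknown
  }
}

/** One instance per update request; the model is fixed at construction. */
final class SketchProcessor {
  private final Model model;

  SketchProcessor(Model model) {
    this.model = model;
  }

  float[] processAdd(String text) {
    return model.vectorise(text);
  }
}

/** Created once; getInstance is called for every incoming update request. */
final class SketchFactory {
  private final String modelName;

  SketchFactory(String modelName) {
    this.modelName = modelName;
  }

  SketchProcessor getInstance(ModelStore store) {
    Model model = store.getModel(modelName);
    if (model == null) {
      throw new IllegalArgumentException(
          "The model requested '" + modelName + "' can't be found in the store");
    }
    return new SketchProcessor(model);
  }
}
```

With this shape the lookup happens once per request rather than once per document, and a missing model fails the request before any document is processed.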
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943488899 ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactory.java: ## @@ -0,0 +1,97 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.params.SolrParams; +import org.apache.solr.common.util.NamedList; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.schema.DenseVectorField; +import org.apache.solr.schema.FieldType; +import org.apache.solr.schema.SchemaField; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.apache.solr.update.processor.UpdateRequestProcessorFactory; + +/** + * This class implements an UpdateProcessorFactory for the Text To Vector Update Processor. 
+ */ +public class TextToVectorUpdateProcessorFactory extends UpdateRequestProcessorFactory { +private static final String INPUT_FIELD_PARAM = "inputField"; +private static final String OUTPUT_FIELD_PARAM = "outputField"; +private static final String MODEL_NAME = "model"; + +String inputField; +String outputField; +String modelName; +SolrParams params; Review Comment: Done!
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
dsmiley commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943539435 ## solr/test-framework/src/java/org/apache/solr/util/RestTestBase.java: ## @@ -88,13 +88,33 @@ private static void checkUpdateU(String message, String update, boolean shouldSu if (response != null) fail(m + "update was not successful: " + response); } else { String response = restTestHarness.validateErrorUpdate(update); -if (response != null) fail(m + "update succeeded, but should have failed: " + response); +if (response == null) fail(m + "update succeeded, but should have failed: " + response); } } catch (SAXException e) { throw new RuntimeException("Invalid XML", e); } } + public static void checkUpdateU(String update, String... tests) { Review Comment: At the moment, RestTestBase is common to basically any test using a "REST-based model store"; the LLM module recently added a new variant of one, hence its use of RestTestBase here. RestTestBase is used a lot. Preferably, we wouldn't depend so much on our class hierarchy to achieve reuse, but there's no realistic action to take right now.
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
dsmiley commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943529534 ## solr/modules/llm/src/test/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactoryTest.java: ## @@ -0,0 +1,129 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.params.MultiMapSolrParams; +import org.apache.solr.common.params.SolrParams; +import org.apache.solr.common.util.NamedList; +import org.apache.solr.llm.TestLlmBase; +import org.apache.solr.request.SolrQueryRequestBase; +import org.junit.AfterClass; +import org.junit.BeforeClass; +import org.junit.Test; + +import java.util.HashMap; +import java.util.Map; + + +public class TextToVectorUpdateProcessorFactoryTest extends TestLlmBase { + private TextToVectorUpdateProcessorFactory factoryToTest = + new TextToVectorUpdateProcessorFactory(); + private NamedList args = new NamedList<>(); + + @BeforeClass + public static void initArgs() throws Exception { +setupTest("solrconfig-llm.xml", "schema.xml", false, false); + } + + @AfterClass + public static void after() throws Exception { +afterTest(); + } + + @Test + public void init_fullArgs_shouldInitFullClassificationParams() { +args.add("inputField", "_text_"); +args.add("outputField", "vector"); +args.add("model", "model1"); +factoryToTest.init(args); + +assertEquals("_text_", factoryToTest.getInputField()); +assertEquals("vector", factoryToTest.getOutputField()); +assertEquals("model1", factoryToTest.getModelName()); + } + + @Test + public void init_nullInputField_shouldThrowExceptionWithDetailedMessage() { +args.add("outputField", "vector"); +args.add("model", "model1"); + +SolrException e = assertThrows(SolrException.class, () -> factoryToTest.init(args)); +assertEquals("Text to Vector UpdateProcessor 'inputField' can not be null", e.getMessage()); + } + + @Test + public void init_notExistentInputField_shouldThrowExceptionWithDetailedMessage() throws Exception { +args.add("inputField", "notExistentInput"); +args.add("outputField", "vector"); +args.add("model", "model1"); + +Map params = new HashMap<>(); +MultiMapSolrParams mmparams = new MultiMapSolrParams(params); 
+SolrQueryRequestBase req = new SolrQueryRequestBase(solrClientTestRule.getCoreContainer().getCore("collection1"), (SolrParams) mmparams) {}; Review Comment: It's an anonymous inner class. What's probably throwing you off is that there are no method overrides, which 99% of the time is the point of writing an anonymous inner class. Here it's because SolrQueryRequestBase (SQRB) is abstract, so he's forced to subclass it in order to use it. I've been thinking about this case recently, and I think we should simply make that impl not abstract.
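The idiom in question can be shown in isolation. Below is a minimal sketch, assuming a hypothetical abstract class `RequestBase` that, like the class discussed above, declares no abstract methods; the trailing `{}` creates an anonymous subclass that overrides nothing, purely so that an instance can be constructed.

```java
/** Hypothetical abstract base with no abstract methods. */
abstract class RequestBase {
  private final String coreName;

  RequestBase(String coreName) {
    this.coreName = coreName;
  }

  String getCoreName() {
    return coreName;
  }
}

final class AnonymousSubclassDemo {
  static RequestBase newRequest(String coreName) {
    // `new RequestBase(coreName)` alone would not compile because the class is abstract.
    // The empty body `{}` declares an anonymous subclass that overrides nothing.
    return new RequestBase(coreName) {};
  }
}
```

If the base class were made concrete, as suggested in the comment, the `{}` could simply be dropped at every call site.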
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943491138 ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java: ## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.SolrInputField; +import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel; +import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.update.AddUpdateCommand; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.util.ArrayList; +import java.util.List; + + +class TextToVectorUpdateProcessor extends UpdateRequestProcessor { +private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + +private final String inputField; +private final String outputField; +private final String model; +private SolrTextToVectorModel textToVector; +private ManagedTextToVectorModelStore modelStore = null; Review Comment: Sure!
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
epugh commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1943278670 ## solr/test-framework/src/java/org/apache/solr/util/RestTestBase.java: ## @@ -88,13 +88,33 @@ private static void checkUpdateU(String message, String update, boolean shouldSu if (response != null) fail(m + "update was not successful: " + response); } else { String response = restTestHarness.validateErrorUpdate(update); -if (response != null) fail(m + "update succeeded, but should have failed: " + response); +if (response == null) fail(m + "update succeeded, but should have failed: " + response); } } catch (SAXException e) { throw new RuntimeException("Invalid XML", e); } } + public static void checkUpdateU(String update, String... tests) { Review Comment: Not specific per se to this, but I wish we had a clearer plan for the future of RestTestBase. Are we embracing it? ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java: ## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.SolrInputField; +import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel; +import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.update.AddUpdateCommand; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.util.ArrayList; +import java.util.List; + + +class TextToVectorUpdateProcessor extends UpdateRequestProcessor { +private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + +private final String inputField; +private final String outputField; +private final String model; +private SolrTextToVectorModel textToVector; +private ManagedTextToVectorModelStore modelStore = null; + +public TextToVectorUpdateProcessor( +String inputField, +String outputField, +String model, +SolrQueryRequest req, +UpdateRequestProcessor next) { +super(next); +this.inputField = inputField; +this.outputField = outputField; +this.model = model; +this.modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore()); +} + +/** + * @param cmd the update command in input containing the Document to process + * @throws IOException If there is a low-level I/O error + */ +@Override +public void processAdd(AddUpdateCommand cmd) throws IOException { +this.textToVector = modelStore.getModel(model); +if (textToVector == null) { +throw new SolrException( +SolrException.ErrorCode.BAD_REQUEST, +"The model requested '" ++ model ++ "' can't be found in the store: " ++ ManagedTextToVectorModelStore.REST_END_POINT); +} + +SolrInputDocument doc = cmd.getSolrInputDocument(); +SolrInputField 
inputFieldContent = doc.get(inputField); +if (!isNullOrEmpty(inputFieldContent, doc, inputField)) { +String textToVectorise = inputFieldContent.getValue().toString();//add null checks and +float[] vector = textToVector.vectorise(textToVectorise); Review Comment: I was chatting with @iamsanjay this morning, and I was expounding on the thought that a lot of folks might want to first index the document with just the core text/string/numbers, and then, since enrichment is SLOW, come back with a streaming expression and do things like vectorization, and an atomic update.. that way you pump your data in as fast as possible, and then enrich at your leisure...This model cons
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
dsmiley commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1942178207 ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactory.java: ## @@ -0,0 +1,97 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.params.SolrParams; +import org.apache.solr.common.util.NamedList; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.schema.DenseVectorField; +import org.apache.solr.schema.FieldType; +import org.apache.solr.schema.SchemaField; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.apache.solr.update.processor.UpdateRequestProcessorFactory; + +/** + * This class implements an UpdateProcessorFactory for the Text To Vector Update Processor. 
+ */ +public class TextToVectorUpdateProcessorFactory extends UpdateRequestProcessorFactory { +private static final String INPUT_FIELD_PARAM = "inputField"; +private static final String OUTPUT_FIELD_PARAM = "outputField"; +private static final String MODEL_NAME = "model"; + +String inputField; +String outputField; +String modelName; +SolrParams params; + + +@Override +public void init(final NamedList args) { +if (args != null) { Review Comment: args should never be null ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactory.java: ## @@ -0,0 +1,97 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.params.SolrParams; +import org.apache.solr.common.util.NamedList; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.schema.DenseVectorField; +import org.apache.solr.schema.FieldType; +import org.apache.solr.schema.SchemaField; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.apache.solr.update.processor.UpdateRequestProcessorFactory; + +/** + * This class implements an UpdateProcessorFactory for the Text To Vector Update Processor. + */ +public class TextToVectorUpdateProcessorFactory extends UpdateRequestProcessorFactory { +private static final String INPUT_FIELD_PARAM = "inputField"; +private static final String OUTPUT_FIELD_PARAM = "outputField"; +private static final String MODEL_NAME = "model"; + +String inputField; +String outputField; +String modelName; +SolrParams params; + + +@Override +public void init(final NamedList args) { +if (args != null) { +params = args.toSolrParams(); +inputField = params.get(INPUT_FIELD_PARAM); +checkNotNull(INPUT_FIELD_PARAM, inputField); + +outputField = params.get(OUTPUT_FIELD_PARAM); +checkNotNull(OUTPUT_FIELD_PARAM, outputField); + +modelName = params.get(MODEL_NAME); +checkNotNull(MODEL_NAME, modelName); +} +} + +private void checkNotNull(String paramName, Object param) { +if (param == null) { +throw new SolrException( +SolrException.Er
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
cpoerschke commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1937732098 ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java: ## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.SolrInputField; +import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel; +import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.update.AddUpdateCommand; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.util.ArrayList; +import java.util.List; + + +class TextToVectorUpdateProcessor extends UpdateRequestProcessor { +private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + +private final String inputField; +private final String outputField; +private final String model; +private SolrTextToVectorModel textToVector; +private ManagedTextToVectorModelStore modelStore = null; + +public TextToVectorUpdateProcessor( +String inputField, +String outputField, +String model, +SolrQueryRequest req, +UpdateRequestProcessor next) { +super(next); +this.inputField = inputField; +this.outputField = outputField; +this.model = model; +this.modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore()); +} + +/** + * @param cmd the update command in input containing the Document to process + * @throws IOException If there is a low-level I/O error + */ +@Override +public void processAdd(AddUpdateCommand cmd) throws IOException { +this.textToVector = modelStore.getModel(model); +if (textToVector == null) { +throw new SolrException( +SolrException.ErrorCode.BAD_REQUEST, +"The model requested '" ++ model ++ "' can't be found in the store: " ++ ManagedTextToVectorModelStore.REST_END_POINT); +} + +SolrInputDocument doc = cmd.getSolrInputDocument(); +SolrInputField 
inputFieldContent = doc.get(inputField); +if (!isNullOrEmpty(inputFieldContent, doc, inputField)) { +String textToVectorise = inputFieldContent.getValue().toString();//add null checks and +float[] vector = textToVector.vectorise(textToVectorise); Review Comment: So text-to-vector in the search case is ephemeral i.e. in case of errors or timeouts the user gets an error or exception back and they may or may not choose to retry. For text-to-vector in the update case, might some users prefer to index only documents with (all the) vectors and others would rather have the document indexed even if some vectors are missing? (I assume but don't know for sure that the `vectorise` call might throw an exception in certain circumstances and if it's not caught that might fail the indexing for the entire document.)
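The two preferences raised here (reject the document, or index it without the vector) could be expressed as a switch on the processor. The following is only a sketch with plain Java maps standing in for Solr documents; the `skipOnFailure` flag, the `Vectoriser` interface and the `FailurePolicySketch` class are hypothetical and do not exist in the PR.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a model whose remote call may fail. */
interface Vectoriser {
  float[] vectorise(String text);
}

final class FailurePolicySketch {
  /**
   * Returns a copy of the document with the vector added. When skipOnFailure is true
   * (a hypothetical option), a failed vectorisation leaves the copy without the vector
   * field; otherwise the exception propagates and the document is rejected.
   */
  static Map<String, Object> enrich(
      Map<String, Object> doc,
      String inputField,
      String outputField,
      Vectoriser model,
      boolean skipOnFailure) {
    Map<String, Object> enriched = new HashMap<>(doc);
    try {
      enriched.put(outputField, model.vectorise(String.valueOf(doc.get(inputField))));
    } catch (RuntimeException e) {
      if (!skipOnFailure) {
        throw e; // strict mode: fail indexing for this document
      }
      // lenient mode: fall through and index the document without the vector
    }
    return enriched;
  }
}
```

In lenient mode the missing vectors would still need to be filled in later, for example through a follow-up update of the affected documents.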
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
cpoerschke commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1937657567 ## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java: ## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.llm.texttovector.update.processor; + +import org.apache.solr.common.SolrException; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.SolrInputField; +import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel; +import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.update.AddUpdateCommand; +import org.apache.solr.update.processor.UpdateRequestProcessor; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.util.ArrayList; +import java.util.List; + + +class TextToVectorUpdateProcessor extends UpdateRequestProcessor { +private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + +private final String inputField; +private final String outputField; +private final String model; +private SolrTextToVectorModel textToVector; +private ManagedTextToVectorModelStore modelStore = null; + +public TextToVectorUpdateProcessor( +String inputField, +String outputField, +String model, +SolrQueryRequest req, +UpdateRequestProcessor next) { +super(next); +this.inputField = inputField; +this.outputField = outputField; +this.model = model; +this.modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore()); +} + +/** + * @param cmd the update command in input containing the Document to process + * @throws IOException If there is a low-level I/O error + */ +@Override +public void processAdd(AddUpdateCommand cmd) throws IOException { +this.textToVector = modelStore.getModel(model); Review Comment: Wondering re: `textToVector` as class member vs. local variable. edit: if `model` is configured i.e. always the same it could be a class member, perhaps, but if perhaps `model` was a field in the `cmd` document then it would need to be a local variable. 
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
cpoerschke commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1937653570

## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java: ##

@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel;
+import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.apache.solr.update.processor.UpdateRequestProcessor;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.ArrayList;
+import java.util.List;
+
+
+class TextToVectorUpdateProcessor extends UpdateRequestProcessor {
+  private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  private final String inputField;
+  private final String outputField;
+  private final String model;
+  private SolrTextToVectorModel textToVector;
+  private ManagedTextToVectorModelStore modelStore = null;

Review Comment:
```suggestion
  private final ManagedTextToVectorModelStore modelStore;
```
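A small illustration of why the suggestion drops the `= null` initialiser as well as adding `final`: in Java a `final` field that already has an initialiser cannot be assigned again in the constructor, so the field has to be left blank and set exactly once. `Store` and `Processor` below are hypothetical stand-ins, not the classes from the PR.

```java
/** Sketch only: Store and Processor are hypothetical stand-ins, not Solr classes. */
class FinalFieldSketch {

  static final class Store {
    final String coreName;

    Store(String coreName) {
      this.coreName = coreName;
    }
  }

  static final class Processor {
    // Blank final: no "= null" initialiser, so the constructor may assign it once.
    private final Store modelStore;

    Processor(String coreName) {
      this.modelStore = new Store(coreName);
      // A second assignment to modelStore here would be a compile-time error.
    }

    Store modelStore() {
      return modelStore;
    }
  }

  public static void main(String[] args) {
    Processor processor = new Processor("techproducts");
    System.out.println(processor.modelStore().coreName);
  }
}
```

The compiler then guarantees the store reference never changes after construction, which is the property the suggested declaration expresses.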
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
cpoerschke commented on code in PR #3151: URL: https://github.com/apache/solr/pull/3151#discussion_r1937651836

## solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactory.java: ##

@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.params.SolrParams;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.schema.DenseVectorField;
+import org.apache.solr.schema.FieldType;
+import org.apache.solr.schema.SchemaField;
+import org.apache.solr.update.processor.UpdateRequestProcessor;
+import org.apache.solr.update.processor.UpdateRequestProcessorFactory;
+
+/**
+ * This class implements an UpdateProcessorFactory for the Text To Vector Update Processor.
+ */
+public class TextToVectorUpdateProcessorFactory extends UpdateRequestProcessorFactory {
+  private static final String INPUT_FIELD_PARAM = "inputField";
+  private static final String OUTPUT_FIELD_PARAM = "outputField";
+  private static final String MODEL_NAME = "model";
+
+  String inputField;
+  String outputField;
+  String modelName;
+  SolrParams params;

Review Comment:
```suggestion
  private String inputField;
  private String outputField;
  private String modelName;
  private SolrParams params;
```
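As a rough sketch of what the suggested visibility change buys, the following keeps the factory's configuration in `private` fields that are written only by `init` and read only by `getInstance`. `SketchProcessor`, `SketchFactory` and the `Map`-based `init` signature are hypothetical simplifications, not the Solr factory API; only the parameter names `inputField`, `outputField` and `model` are taken from the quoted diff.

```java
import java.util.Map;

/** Hypothetical processor that simply carries the configured values. */
class SketchProcessor {
  final String inputField;
  final String outputField;
  final String modelName;

  SketchProcessor(String inputField, String outputField, String modelName) {
    this.inputField = inputField;
    this.outputField = outputField;
    this.modelName = modelName;
  }
}

/** Hypothetical factory; the Map-based init is an assumption for illustration. */
class SketchFactory {
  // Private: other classes in the package cannot read or change the configuration.
  private String inputField;
  private String outputField;
  private String modelName;

  void init(Map<String, String> args) {
    inputField = args.get("inputField");
    outputField = args.get("outputField");
    modelName = args.get("model");
  }

  SketchProcessor getInstance() {
    return new SketchProcessor(inputField, outputField, modelName);
  }
}

class PrivateFieldFactoryDemo {
  public static void main(String[] args) {
    SketchFactory factory = new SketchFactory();
    factory.init(Map.of("inputField", "body", "outputField", "vector", "model", "m1"));
    System.out.println(factory.getInstance().modelName);
  }
}
```

With package-private fields, any class in the same package could mutate the configuration after `init`; making them `private` narrows that to the factory itself.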
Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]
alessandrobenedetti commented on PR #3151: URL: https://github.com/apache/solr/pull/3151#issuecomment-2627858272 Let's give it a first round of discussion/brainstorming/improvements. Then I'll adjust the gradle check, documentation and changes.