Re: Ignore unknown fields when indexing PDFs

2024-06-04 Thread Thomas Corthals
When you extract text from a PDF with Tika, it includes additional metadata fields. This is the document I get after executing the example from the ref guide at https://solr.apache.org/guide/solr/latest/indexing-guide/indexing-with-tika.html#trying-out-solr-cell { "responseHeader":{ "status":0,

Re: NGinx .conf Setup for Solr

2024-06-04 Thread Thomas Corthals
Hi Lee, Solr was installed on a different server that doesn't run anything else so we can tweak the resources independently of what the web server needs. But I don't see a reason why this workaround wouldn't work with a Solr install on the same server. The subdomain isn't tied to Solr in any

Re: NGinx .conf Setup for Solr

2024-06-03 Thread Thomas Corthals
I've done a setup where I use nginx on Plesk to proxy to a separate Solr server. I created a new subscription in Plesk and used a subdomain for it, which gives me a distinct hostname on the same IP address. This is what my nginx configuration looks like: auth_basic "Solr"; auth_basic_user_file

Re: solr query sanitizer?

2024-05-29 Thread Thomas Corthals
Solarium (a PHP client for Solr) has a helper method to escape search terms that uses a regex to escape special characters. https://github.com/solariumphp/solarium/blob/c2744ff706a2f0be148a45d702700fc346429679/src/Core/Query/Helper.php#L82 Thomas Op wo 29 mei 2024 om 16:11 schreef Dmitri Maziuk
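
For readers who don't use PHP, the same idea can be sketched in Python. This is a minimal sketch and not Solarium's actual regex: the list of special characters below is an assumption based on the standard query parser syntax, and the name escape_term is only illustrative.

```python
import re

# Characters and operators that the standard query parser treats as syntax.
# This list is an assumption for illustration; it is not copied from Solarium.
_SPECIAL = re.compile(r'(&&|\|\||[+\-!(){}\[\]^"~*?:/\\])')


def escape_term(term: str) -> str:
    """Put a backslash in front of every special character or operator."""
    return _SPECIAL.sub(r"\\\1", term)
```

Escaping like this only protects a single term; it doesn't replace validating which fields and parameters a user is allowed to touch.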

Re: Invalid JSON on solr9 when using a binary field

2024-04-18 Thread Thomas Corthals
Stoney : > Hi Thomas, > That link doesn’t seem to work for me (might be that my corporate AV is > messing with it as it rewrites it to some outlook safe link!) – do you have > an issue ref? > > We’re trying to go to 9.5.0, which is the latest. > > From: Thomas Corthals >

Re: Invalid JSON on solr9 when using a binary field

2024-04-17 Thread Thomas Corthals
Hi Karl, What is the exact version of Solr 9 you're trying to upgrade to? This sounds very similar to this issue: https://lists.apache.org/list?users@solr.apache.org:gte=0d:_subject=Invalid%20JSON%20response%20with%20UUID%20field Thomas Op wo 17 apr 2024 om 16:55 schreef Karl Stoney : > Hi, >

Re: Apache Solr Query Issue with huge data

2024-04-05 Thread Thomas Corthals
of documents. Thomas Op vr 5 apr 2024 om 12:10 schreef prasad bezavada : > Hello Thomas Corthals, > > Thank you very much for your valuable reply. I am trying to use cursors, > but for the first query also its taking so much time to get the results, > and next query I am getting hea

Re: Atomic updates with CBOR?

2024-04-05 Thread Thomas Corthals
can be added > > On Thu, 4 Apr 2024, 7:29 am Thomas Corthals, > wrote: > > > Hi, > > > > > > Is it possible to issue atomic updates with the CBOR update format? It > > isn't mentioned in the ref guide and it's a quite inconvenient format to > > whip up a quick curl command to test this. > > > > > > Thomas > > >

Re: Apache Solr Query Issue with huge data

2024-04-05 Thread Thomas Corthals
Hi Prasad, This is expected with "deep paging": https://solr.apache.org/guide/solr/latest/query-guide/pagination-of-results.html#performance-problems-with-deep-paging Have a look at cursors instead, that should solve your problem:
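
A cursor loop could look roughly like the sketch below. It assumes a hypothetical fetch_page callable that sends the query with the given cursorMark (and a sort that includes the uniqueKey field) and returns the decoded JSON response; the HTTP part is deliberately left out.

```python
from typing import Callable, Dict, Iterator


def iterate_with_cursor(fetch_page: Callable[[str], Dict]) -> Iterator[Dict]:
    """Yield every document by following nextCursorMark until it stops changing.

    fetch_page is a hypothetical callable supplied by the caller: it takes a
    cursorMark value and returns the decoded JSON response as a dict.
    """
    cursor = "*"  # documented starting value for cursorMark
    while True:
        page = fetch_page(cursor)
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:  # same mark returned again: no more results
            break
        cursor = next_cursor
```

Because each request continues from where the previous one stopped, Solr doesn't have to collect and skip all the earlier rows the way it does with a large start offset.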

Atomic updates with CBOR?

2024-04-03 Thread Thomas Corthals
Hi, Is it possible to issue atomic updates with the CBOR update format? It isn't mentioned in the ref guide and it's a quite inconvenient format to whip up a quick curl command to test this. Thomas

Re: Bury search results with bq

2024-03-13 Thread Thomas Corthals
If you know the IDs of the documents and you want to exclude them entirely (not just sort them last), have a look if the Query Elevation Component suits your needs. https://solr.apache.org/guide/solr/latest/query-guide/query-elevation-component.html Thomas Op wo 13 mrt 2024 om 17:36 schreef

Re: Solr 9.5.0 returns no history with v2 logging API

2024-02-22 Thread Thomas Corthals
chael Gibney > wrote: > > > > It looks like this is related to > > https://issues.apache.org/jira/browse/SOLR-17063 > > > > I'm investigating, but it looks like it would be appropriate to open a > > Jira issue for this. > > > > Thanks for reporting!

Solr 9.5.0 returns no history with v2 logging API

2024-02-21 Thread Thomas Corthals
Hi I've been using api/node/logging/messages since 9.3 and api/node/logging with older versions to get logs and with Solr 9.5.0 this no longer outputs the history and causes an ERROR instead. I can get the logs with the v1 API at solr/admin/info/logging so they are stored correctly. On a freshly

Re: Upgrade SolrCloud with external cluster Apache Zookeeper

2024-01-17 Thread Thomas Corthals
olr 7.7 to 8.11 and then solr version 8.11 to 9 works? > > And what about Apache Zookeeper, can i upgrade to apache zookeeper 3.9 with > solr 7.7? > > thanks. > > pd. sorry about my english writting > > > El mar, 16 ene 2024 a las 15:15, Thomas Corthals () > escribió:

Re: Upgrade SolrCloud with external cluster Apache Zookeeper

2024-01-16 Thread Thomas Corthals
Hi Sandra, You can't upgrade more than one major Solr version. Solr 9 will refuse to work with a core that has ever been touched by Solr 7. My advice for any upgrade is to set up the new version from scratch and rebuild your index from the authoritative data source. Thomas Op di 16 jan 2024

Re: Invalid PHPS response for Luke request

2023-12-04 Thread Thomas Corthals
ot SolrDocument really (see > LukeRequestHandler.handleRequestBody). > Do you really need /luke can't you obtain a doc via /select or /get? > > On Sun, Dec 3, 2023 at 11:49 AM Thomas Corthals > wrote: > > > Hi all, > > > > > > The output of a Luke request

Invalid PHPS response for Luke request

2023-12-03 Thread Thomas Corthals
Hi all, The output of a Luke request for a specific document with wt=phps can't be unserialized in PHP because it contains an error. > curl 'http://localhost:8983/solr/techproducts/admin/luke?id=apple&wt=phps' The output is structured like this, I'm omitting some details for brevity.

Re: question on docker example from guide

2023-10-26 Thread Thomas Corthals
Hi Vince, If you want to run the techproducts examples, these commands from https://stackoverflow.com/a/55171062 do the trick: - docker run --name my_solr -d -p 8983:8983 -t solr - docker exec -it --user=solr my_solr bin/solr create_core -c techproducts - docker exec -it --user=solr

Re: Paging a delete query?

2023-10-23 Thread Thomas Corthals
Hi Koen, You'll have to implement that on the client side. If you happen to use PHP, the Solarium PHP client has a plugin that does just that: https://solarium.readthedocs.io/en/stable/plugins/#buffereddelete-plugin (Full disclosure: I wrote the plugin.) Thomas Op ma 23 okt 2023 om 10:47

Re: what is SOLR syntax to remove duplicated documents

2023-10-23 Thread Thomas Corthals
Probably not very helpful for the original question, but for the sake of completeness: you can use the Lucene documentID with the Luke Request Handler. https://solr.apache.org/guide/solr/latest/indexing-guide/luke-request-handler.html You cannot use it as a reliable identifier for your Solr

Re: what is SOLR syntax to remove duplicated documents

2023-10-22 Thread Thomas Corthals
Hi Vince I would fix whatever indexing process caused the doubles and just rebuild the index from the source data. That's something you should always be able to do anyway. Thomas Op zo 22 okt 2023 om 14:38 schreef Vince McMahon < sippingonesandze...@gmail.com>: > all fields are the same will

[jira] [Commented] (SOLR-6853) solr.ManagedSynonymFilterFactory/ManagedStopwordFilterFactory: URLEncoding - Not able to delete Synonyms/Stopwords with special characters

2023-10-13 Thread Thomas Corthals (Jira)
[ https://issues.apache.org/jira/browse/SOLR-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775074#comment-17775074 ] Thomas Corthals commented on SOLR-6853: --- [~epugh] It looks like the PR only fixed an issue

Re: Response type of facet.field (classic) results

2023-07-16 Thread Thomas Corthals
Hi Ishan, If you want a JSON object with key => value pairs instead of a JSON array with alternating keys and values in the response, you can add json.nl=map to your request. https://solr.apache.org/guide/solr/latest/query-guide/response-writers.html#json-nl Thomas Op zo 16 jul 2023 om 01:41 schreef
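
If changing the request isn't an option, the alternating array can also be paired up on the client. A minimal Python sketch, with made-up facet values and an illustrative function name:

```python
def flat_to_map(flat):
    """Pair up alternating keys and values: ["a", 1, "b", 2] -> {"a": 1, "b": 2}."""
    return dict(zip(flat[::2], flat[1::2]))


# Hypothetical facet counts in the default "flat" layout.
facet_counts = ["electronics", 12, "memory", 3]
print(flat_to_map(facet_counts))
```

Dictionaries keep insertion order in current Python versions, so the facet order of the response is preserved.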

Re: Deleting document on wrong shard?

2023-05-25 Thread Thomas Corthals
Hi Walter Deleting multiple IDs at once with JSON is mentioned here: https://solr.apache.org/guide/solr/latest/indexing-guide/indexing-with-update-handlers.html#sending-json-update-commands Or a list of document IDs: > > { "delete":["id1","id2"] } > > Thomas Op wo 24 mei 2023 om 22:23 schreef
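
Building that body from a list of IDs is safer than concatenating strings by hand. A minimal Python sketch (the function name is only illustrative):

```python
import json


def delete_by_ids_payload(ids):
    """Return the JSON body for deleting several documents by ID in one request."""
    return json.dumps({"delete": list(ids)})


print(delete_by_ids_payload(["id1", "id2"]))
```

json.dumps takes care of quoting and escaping, so IDs containing quotes or backslashes don't break the request.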

Re: Copy-field doesn't seem to be working as expected

2023-05-20 Thread Thomas Corthals
Op za 20 mei 2023 om 21:18 schreef Shawn Heisey : > Agreed. There are many situations outside of version upgrades where > rebuilding the index from scratch is an absolute requirement. It is > something all Solr users need to be able to do at ANY time. I used to > maintain an index where a full

Re: Streaming of Documents with text columns (_txt)

2023-05-17 Thread Thomas Corthals
Have a look at the Schema API: https://solr.apache.org/guide/solr/latest/indexing-guide/schema-api.html Thomas Op ma 15 mei 2023 om 15:05 schreef Subhasis Patra : > Thanks for your response. I am using dynamic schema. But I want to copy > all _txt fields to _s fields. I know if I add copy

Re: Delete silently failing.

2023-03-07 Thread Thomas Corthals
alter Underwood : > Is it supposed to be: > > {“delete”: {“id”: "1E089335-892C-41F6-B767-632EB5361775”}} > > wunder > Walter Underwood > wun...@wunderwood.org > https://observer.wunderwood.org/ (my blog) > > > On Mar 7, 2023, at 1:20 PM, Thomas Corthals > wrote: >

Re: Delete silently failing.

2023-03-07 Thread Thomas Corthals
Got blindsided by the quotes and didn't notice you already have commit=true as a URL parameter. That should already cover my suggestion. Op di 7 mrt 2023 om 22:06 schreef Thomas Corthals : > Hi Matthew, > > There seems to be something strange going on with single quotes and > backsl

Re: Delete silently failing.

2023-03-07 Thread Thomas Corthals
Hi Matthew, There seems to be something strange going on with single quotes and backslashes around your delete command. Best to use double quotes inside a single quoted command argument when sending JSON like this. Maybe you queried too soon, before the change was committed to the index? You can
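
One way to avoid quoting mistakes altogether is to let a script generate the command line. The Python sketch below is only an illustration: the URL and the document ID passed in are placeholders, and curl_delete_command is a made-up helper name.

```python
import json
import shlex


def curl_delete_command(update_url: str, doc_id: str) -> str:
    """Build a curl command whose JSON body is safely quoted for a POSIX shell."""
    payload = json.dumps({"delete": {"id": doc_id}})  # double quotes inside the JSON
    args = [
        "curl", "-X", "POST",
        "-H", "Content-Type: application/json",
        "--data-binary", payload,
        update_url,
    ]
    # shlex.quote wraps any argument with special characters in single quotes.
    return " ".join(shlex.quote(arg) for arg in args)


print(curl_delete_command(
    "http://localhost:8983/solr/techproducts/update?commit=true",  # placeholder URL
    "1E089335-892C-41F6-B767-632EB5361775",
))
```

The printed command has the JSON in double quotes inside a single-quoted argument, which is the layout suggested above.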

[jira] [Commented] (SOLR-16183) XML Loader: support indexing single nested child document

2023-02-28 Thread Thomas Corthals (Jira)
[ https://issues.apache.org/jira/browse/SOLR-16183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694611#comment-17694611 ] Thomas Corthals commented on SOLR-16183: The {{name="child"}} on the doc is current

[jira] [Commented] (SOLR-16183) XML Loader: support indexing single nested child document

2023-02-27 Thread Thomas Corthals (Jira)
[ https://issues.apache.org/jira/browse/SOLR-16183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694003#comment-17694003 ] Thomas Corthals commented on SOLR-16183: Hi, [~vinayakhegde] They're both [Nested Documents

Re: PKIAuthenticationPlugin error

2023-02-26 Thread Thomas Corthals
Hi Hari Are the clocks on both the server and the client running correctly? My guess is that one of them has drifted by more than 15 seconds. Thomas Op zo 26 feb. 2023 12:24 schreef HariBabu kuruva : > I see the below error repeatedly in our Solr logs. Please suggest how to > avoid this. > >

Re: synonym file

2023-02-22 Thread Thomas Corthals
I let the content team define their own synonyms as they see fit in the backend and use Managed Resources to push their changes to Solr. https://solr.apache.org/guide/solr/latest/configuration-guide/managed-resources.html Thomas Op wo 22 feb. 2023 om 20:32 schreef Mikhail Khludnev : > I

Re: Nested Fields Schema Definition

2023-02-18 Thread Thomas Corthals
Hi Christoph All fields of the parent and child documents must match a field or dynamicField definition in the schema. Those field definitions are the same for all types of documents, whether it is a parent or a child. This doesn't only mean they have to be the same fieldType, it also means you

Re: Importing Data from MySql

2023-01-13 Thread Thomas Corthals
Build an indexer that can send updates in batches. It'll be faster than sending each document in a separate request. Op vr 13 jan. 2023 om 16:41 schreef Dave : > Yeah, it’s trivial building your own indexer in any language that can read > a db. Also I wouldn’t trust the dih on its own even when
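
The batching part is small in any language. A minimal Python sketch, assuming the rows have already been read from the database and turned into dicts; the batch size is something to tune for your own documents:

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Group documents into lists of at most `size` items, one list per update request."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(docs)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Each yielded list can be serialized as a JSON array and sent to the update handler in a single request.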

Re: Some questions about Solr NOT query

2023-01-06 Thread Thomas Corthals
Hi A negative query is a subtraction from a set of matched documents. With (*:* NOT user_id:0) you are subtracting from the set of all documents in the index first, then intersecting with the documents that match the other clauses. With (NOT user_id:0) you are subtracting directly from the
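
The subtraction can be pictured with plain sets. The Python sketch below uses made-up document IDs and assumes that a nested, purely negative clause has nothing to subtract from; that assumption is mine, not a quote from the message.

```python
def subtract(base: set, excluded: set) -> set:
    """A negative clause removes its matches from the set it is applied to."""
    return base - excluded


ALL_DOCS = {"a", "b", "c"}   # hypothetical index contents, i.e. what *:* matches
USER_ID_0 = {"a"}            # hypothetical documents matching user_id:0

with_match_all = subtract(ALL_DOCS, USER_ID_0)  # (*:* NOT user_id:0)
pure_negative = subtract(set(), USER_ID_0)      # nested (NOT user_id:0), assumed empty base

print(with_match_all, pure_negative)
```

Starting from all documents leaves the ones you want; starting from nothing leaves nothing, whatever is subtracted.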

Re: Combine facet counts across fields

2023-01-04 Thread Thomas Corthals
Post processing on two fields means you might have to implement your own pagination too instead of using facet.limit and facet.offset. Potentially over millions of usernames. It also renders facet.mincount useless. Thomas Op wo 4 jan. 2023 om 18:59 schreef Mikhail Khludnev : > Hello Julia, >

Re: Combine facet counts across fields

2023-01-04 Thread Thomas Corthals
Hi Julia, As I'm working with indexes that are updated infrequently and queried very frequently, I would duplicate that data with copyField directives at index time. Writing a custom facet processor comes with the risk that it might break with a Solr upgrade. Are you talking millions of unique

Re: Multiple cores

2022-12-28 Thread Thomas Corthals
For our corpus, term frequency gets in the way of how we want to rank search results rather than being helpful. I put this in our schema to effectively turn Okapi BM25 into BM15: <similarity class="solr.BM25SimilarityFactory"><float name="b">0</float></similarity> Thomas Op wo 28 dec. 2022 om 14:35 schreef Eric

Re: child documents anyone?

2022-12-23 Thread Thomas Corthals
Hi Dima, If your schema is configured correctly for atomic updates, you can add child documents atomically as well. If you want to add/replace/delete individual child documents, you'll have to use the JSON request format because those updates aren't supported through XML (SOLR-12677

Re: Reg. Field Level Update (Atomic update) (Solr 8.11.1) behavior

2022-12-21 Thread Thomas Corthals
Hi Rajeswari, I think you're just "lucking out" because you're setting that field specifically. There is no need to get the stored content from the index as you're providing the content in your update query. Try atomically updating another field and see if you can still find this document by

Re: Is there a way to shape a query response from SOLR similar to the way the Script Update Processor can transform an update payload/document?

2022-12-08 Thread Thomas Corthals
Hi Matthew Do you mean you want to relabel the field names in the reply? They can be aliased. https://solr.apache.org/guide/solr/latest/query-guide/common-query-parameters.html#field-name-aliases IIRC, there's no way to guarantee the order of the fields in the response. It's probably the order

Re: SOLR adding , to strings erroneously

2022-12-05 Thread Thomas Corthals
Op za 3 dec. 2022 om 18:47 schreef Shawn Heisey : > On 12/3/22 10:38, dmitri maziuk wrote: > > On 2022-12-02 7:41 PM, Shawn Heisey wrote: > > > >> I'm curious as to why those entities are displaying as text instead > >> of being interpreted by the browser as a zero-width space. > > > > I am

[jira] [Commented] (SOLR-16469) On some systems &#8203; is inserted after every comma

2022-12-02 Thread Thomas Corthals (Jira)
[ https://issues.apache.org/jira/browse/SOLR-16469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642669#comment-17642669 ] Thomas Corthals commented on SOLR-16469: This seems to do the trick for the admin UI. {code:js

Re: SOLR adding , to strings erroneously

2022-12-01 Thread Thomas Corthals
Op do 1 dec. 2022 om 17:12 schreef dmitri maziuk : > On 2022-12-01 6:41 AM, Eric Pugh wrote: > > Shawn, > > > > Have we received a couple of mentions of this? Or am I misremembering? > Do we need to open a JIRA and change how logging.js works? > > >

Re: Update all rows with a value for one specific field in one core

2022-11-28 Thread Thomas Corthals
Hi Shushuai, Like Mikhail said, bulking your updates will be way faster than sending individual update requests. Depending on how large (or should I say small) the actual value is you're setting, you might be able to send a few hundred or even a few thousand updates per request. If you're

Re: [External] : Re: Solr 8.11.1 Optimise command not deleting the deletedDoc

2022-11-25 Thread Thomas Corthals
Hi Rajan It's explained in the ref guide. https://solr.apache.org/guide/8_11/uploading-data-with-index-handlers.html#commit-and-optimize-during-updates Thomas Op vr 25 nov. 2022 om 16:08 schreef Nagarajan Muthupandian < nagarajan.muthupand...@oracle.com>: > Hi Bernd, > > Thank you for the

Re: solr and dovecot: high load

2022-11-24 Thread Thomas Corthals
If it's always around the same time, one of the things to check is cronjobs that run on the server and on all clients at that time. This includes cronjobs that are scheduled to start at that time, and those that start before it and might run long enough to reach a 'critical mass' by that time.

Re: possible bug in import and dynamic fields?

2022-11-24 Thread Thomas Corthals
Hi Dima, Which version of Solr are you using? I recently ran into an issue with the price field and atomic updates when nested documents are involved. That happened against Solr 9.0.0 with code that ran fine against Solr 8.11.2. I'm wondering if it could be related (although I got a different

Re: SOLR upgrade from 5.4.1 version to 8.11.2

2022-11-03 Thread Thomas Corthals
Hello Jaivik, You can upgrade to the next major version (Solr X → X+1) but not to the one after that (Solr X → X+2), even if you try to do it one major version at a time (Solr X → X+1 → X+2). This is a consequence of Lucene's back-compatibility strategy. The index "knows" in which major version
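
The rule boils down to a one-line check. A minimal Python sketch, assuming the only thing that matters is the major version in which the index was originally created; the function name is illustrative:

```python
def can_open_index(created_major: int, running_major: int) -> bool:
    """An index is readable by the major version that created it and the next one."""
    return running_major - 1 <= created_major <= running_major


# A 5.x index under Solr 8 fails the check, so it has to be rebuilt.
print(can_open_index(5, 8))
```

Stepping through intermediate versions doesn't change the outcome, because the major version in which the index was created stays the same.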

Re: I cannot get nested objects to index - with image links

2022-11-03 Thread Thomas Corthals
with a sick > child, but the pandemic amplified that because there is even more unknown > now.”\\nSo, how could their staff keep serving families with kids in > need? \\nThe team at RMHCI decided to start assembling lunch boxes > filled with meals for families with kids at St. Luke’s Ch

Re: I cannot get nested objects to index - with image links

2022-10-31 Thread Thomas Corthals
": "Blog Post Name", > "Type": "Single-Line Text", > "Value": "Ronald McDonald House, St. Luke’s Children’s find > new ways to help families" > }, > { "Name": "Blog Summary", "Type"

Re: 8.11 docs "Sending JSON Update Commands" bug?

2022-10-31 Thread Thomas Corthals
Solr does deviate from the 'does not assign any significance to the ordering of name/value pairs' part of that spec though. The order of "add"s and "delete"s within an update request does matter. Thomas Op ma 31 okt. 2022 om 21:50 schreef Walter Underwood : > Duplicate keys are somewhat

Re: Solr 9 possible analysis error with currency field and nested child

2022-10-28 Thread Thomas Corthals
he > smallest part of the parent/child structure. > > Jan > > > 27. okt. 2022 kl. 23:51 skrev Thomas Corthals : > > > > Op do 27 okt. 2022 om 22:04 schreef Shawn Heisey <mailto:apa...@elyograg.org>>: > > > >> On 10/19/22 16:41, Thomas

Re: HTTP errors POSTing to 8.11.2

2022-10-28 Thread Thomas Corthals
Op vr 28 okt. 2022 om 17:37 schreef dmitri maziuk : > It's a clean stand-alone install. It's not going through any proxies, > the scripts are erroring out when run on the same server too, and it > being python, the complete http conversation is a bit hard to get to. Hi Dima Whenever I need the

Re: Solr 9 possible analysis error with currency field and nested child

2022-10-27 Thread Thomas Corthals
Op do 27 okt. 2022 om 22:04 schreef Shawn Heisey : > On 10/19/22 16:41, Thomas Corthals wrote: > > I'm running into an exception with Solr 9.0.0 for a request that works > fine > > with Solr 8.11.2 and I have no idea why. > > > However, if I don't do a commit betwee

Re: Solr 9 possible analysis error with currency field and nested child

2022-10-27 Thread Thomas Corthals
Bumping this to the list again in case anyone has any insights before I open an issue in JIRA for this. Op do 20 okt. 2022 om 00:41 schreef Thomas Corthals : > Hi, > > > I'm running into an exception with Solr 9.0.0 for a request that works > fine with Solr 8.11.2 and I h

Re: I cannot get nested objects to index - with image links

2022-10-27 Thread Thomas Corthals
https://i.postimg.cc/RZtcF8bB/Screenshot-2022-10-26-170222.png > > -- > *From:* Thomas Corthals > *Sent:* Tuesday, October 25, 2022 1:28 AM > *To:* users@solr.apache.org > *Subject:* Re: I cannot get nested objects to index - with image links >

Re: I cannot get nested objects to index - with image links

2022-10-25 Thread Thomas Corthals
Hi Matthew The (pseudo-)field in which you want to put the nested documents ("content" in your example) should not be added to the schema. The actual fields of the nested document (id, stuff1, stuff2) need to match an explicit field definition or a dynamicField definition in your schema though.

Solr 9 possible analysis error with currency field and nested child

2022-10-19 Thread Thomas Corthals
", "root-error-class","java.lang.IllegalArgumentException"], "msg":"Exception writing document id parent to the index; possible analysis error: cannot change field \"price_cl_ns\" from doc values type=NONE to inconsistent doc values type=NUMERIC", "code":400}} However, if I don't do a commit between the two adds, I don't get the error. Did something change between Solr 8 and 9 that I have to account for in my schema or my update requests? Or is this a bug? Kind regards, Thomas Corthals

Re: Fastest way to index data to solr

2022-09-30 Thread Thomas Corthals
Hi Gus, I have a followup question. Is JSON parsed faster than XML by Solr if they represent the exact same documents? Thomas Op vr 30 sep. 2022 om 06:58 schreef Gus Heck : > If you are using a non-java language you can use JSON. >

Re: User access to deployed Solr instance

2022-09-27 Thread Thomas Corthals
Op di 27 sep. 2022 om 04:58 schreef Shawn Heisey : > On 9/26/22 15:06, Victoria Stuart (VictoriasJourney.com) wrote: > > To clarify - in my case the web page has an input / search element that > connects to Solr (running in the background) via an Ajax script. > > This is a very bad idea. You've

Re: Atomic indexing as default indexing

2022-09-23 Thread Thomas Corthals
Op vr 23 sep. 2022 om 18:17 schreef Shawn Heisey : > On 9/23/22 09:51, gnandre wrote: > > Is there a way to make atomic indexing default? > > > > Say, even if some clients send non-atomic indexing requests, it should > get > > converted to atomic indexing requests on Solr end, is that possible? >

Re: How is fieldNorm calculated when omitNorms is set to true?

2022-09-20 Thread Thomas Corthals
onfiguration example. > > Kind regards, > Stian > > > > søn. 18. sep. 2022 kl. 16:21 skrev Thomas Corthals >: > > > Hi Stian, > > > > We have the same issue with our documents. I fixed that by setting b = 0 > in > > our schema for BM25 similarity. >

Re: How is fieldNorm calculated when omitNorms is set to true?

2022-09-18 Thread Thomas Corthals
Hi Stian, We have the same issue with our documents. I fixed that by setting b = 0 in our schema for BM25 similarity. <similarity class="solr.BM25SimilarityFactory"><float name="b">0</float></similarity> I don't know if BM25 can be used with your version of Solr. Personally I think it's worth upgrading for. Thomas Op za 17 sep. 2022 om 19:30 schreef Stian

Re: Using substring functionality to reach field value in solt

2022-09-15 Thread Thomas Corthals
Or if it's inconvenient to do in the code at query time: get the substring at index time, store it as a separate field, and retrieve that value at query time. Op do 15 sep. 2022 om 16:23 schreef Dave : > You would need to do that in the code end of reading the document from the > index. Search

Re: Allow anonymous search on otherwise Basic Auth-protected Solr instance?

2022-09-02 Thread Thomas Corthals
If you really want to take all of the data, use a cursorMark.  Op vr 2 sep. 2022 om 18:38 schreef Dave : > Exactly. This is a serious security loophole you would be opening up. What > if I just ask for *:* and 5 rows to just, take all of your data, > while crashing your server, and just

[jira] [Commented] (SOLR-16293) Luke request fails for document with a binary field

2022-07-13 Thread Thomas Corthals (Jira)
[ https://issues.apache.org/jira/browse/SOLR-16293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566409#comment-17566409 ] Thomas Corthals commented on SOLR-16293: This is the full response of the Luke request. As you

[jira] [Created] (SOLR-16293) Luke request fails for document with a binary field

2022-07-13 Thread Thomas Corthals (Jira)
Thomas Corthals created SOLR-16293: -- Summary: Luke request fails for document with a binary field Key: SOLR-16293 URL: https://issues.apache.org/jira/browse/SOLR-16293 Project: Solr Issue

Re: Unsubscribe me

2022-07-13 Thread Thomas Corthals
And if all else fails, set up a message rule to silently discard them. Op wo 13 jul. 2022 om 08:44 schreef Michał Świątkowski : > Hi, > > To remove address from the list, send a message to > users-unsubscr...@solr.apache.org > > Regards, > Michal > > On 7/13/22 04:22, Gus Heck wrote: > > Have

Re: Transfer to a new server

2022-07-11 Thread Thomas Corthals
Hello Mike, If possible, just rebuild it from the original source on the new server. Regards, Thomas Op ma 11 jul. 2022 om 13:28 schreef Mike : > Hello! > > How can I transfer a 500 GB Solr index to a new server? > > Thanks >

Re: Solr eats up all the memory

2022-07-04 Thread Thomas Corthals
Hello Mike, If possible, run Solr on a separate machine. You're still going to need to spec it out and configure it to your needs, but at least your client code will keep running. Thomas Op ma 4 jul. 2022 11:01 schreef Mike : > Hello! > > My Solr index size is around 500GB and I have 64GB of

[jira] [Commented] (SOLR-16274) HEAD request for managed resource → 500 Server Error

2022-06-29 Thread Thomas Corthals (Jira)
[ https://issues.apache.org/jira/browse/SOLR-16274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17560590#comment-17560590 ] Thomas Corthals commented on SOLR-16274: Here's a comparison between 8.11.1 and 8.11.2

[jira] [Created] (SOLR-16274) HEAD request for managed resource → 500 Server Error

2022-06-29 Thread Thomas Corthals (Jira)

Luke request for document with a binary field

2022-06-22 Thread Thomas Corthals
Hi, I'm trying to get document details from the Luke request handler and it fails when my document contains a binary field. Starting from the techproducts examples, which already contains: I added this to the schema: Indexed a simple document which I can get back in a query: {

Re: Re-index after upgrade

2022-06-12 Thread Thomas Corthals
Or if you have the resources, set up a separate machine for the new Solr version and reindex and test against that one before switching. Op zo 12 jun. 2022 20:21 schreef Dave : > You don’t need a new core/collection, just reindex everything again. > Ideally since you’re using standalone (way

Re: Re: Re: Unique key field

2022-06-07 Thread Thomas Corthals
You can ask Luke: http://localhost:8983/solr/techproducts/admin/luke?show=all&fl=id On Solr 8.11.1, I get this snippet as part of the output: "fields":{ "id":{ "type":"string", "schema":"I-S-U-OF-l", If it had docValues="true" in the schema, the fourth flag would be a D

Re: How to select the class of fields

2022-05-17 Thread Thomas Corthals
Hi Zhiqing, It is very common with Solr to have the same value indexed in different fields with different analyses. If you do it with copyField, you don't even have to worry about it at index time. Thomas Op di 17 mei 2022 om 22:21 schreef WU, Zhiqing : > Hello, > We are going to change the

Re: [ANNOUNCE] Apache Solr 9.0.0 released

2022-05-13 Thread Thomas Corthals
Congratulations to the team and all volunteers! Op do 12 mei 2022 om 01:40 schreef Jan Høydahl : > * Docker image creation is now a part of the Apache Solr Github repo. > Any idea when the official image will be available on Docker Hub? Thomas

Re: Indexing "single nested child" in XML

2022-05-05 Thread Thomas Corthals
only read Java and don't really speak it. Thomas Op zo 1 mei 2022 om 22:44 schreef Mikhail Khludnev : > Hello, Thomas. > > I think we never think about singleton as a special case, never distinguish > it from array. > > > On Sun, May 1, 2022 at 5:03 PM Thomas Corthals >

[jira] [Commented] (SOLR-16183) XML Loader: support indexing single nested child document

2022-05-05 Thread Thomas Corthals (Jira)
[ https://issues.apache.org/jira/browse/SOLR-16183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532237#comment-17532237 ] Thomas Corthals commented on SOLR-16183: The JSON Loader recognises both "single-v

[jira] [Created] (SOLR-16183) XML Loader: support indexing single nested child document

2022-05-05 Thread Thomas Corthals (Jira)
Thomas Corthals created SOLR-16183: -- Summary: XML Loader: support indexing single nested child document Key: SOLR-16183 URL: https://issues.apache.org/jira/browse/SOLR-16183 Project: Solr

Re: Indexing "single nested child" in XML

2022-05-01 Thread Thomas Corthals
"docs":[ { "id":"1", "single_child":[ { "id":"2"}], "children":[ { "id":"3"}, { "id":"4

Indexing "single nested child" in XML

2022-04-29 Thread Thomas Corthals
Hi, In a JSON request, you can add nested child documents as a single document or an array of documents. JSON data: { "id": "1", "single_child": { "id": "2" }, "children": [{ "id": "3" }, { "id": "4" }] } Response: { "responseHeader":{ "status":0,

Re: Is solr what I want, or something else?

2022-04-19 Thread Thomas Corthals
Op ma 18 apr. 2022 02:05 schreef Shawn Heisey : > On 4/17/2022 10:55 AM, Jay Scott wrote: > > i'm going to watch some tutorials and see if they'll show > > me what i need. i still have a feeling solr is much more > > than what i need, but, oh, well. it won't hurt me to learn > > something new.

Re: SOLR TF/IDF factor removal.

2022-04-14 Thread Thomas Corthals
You can tweak the parameters of BM25 similarity: https://solr.apache.org/docs/8_1_1/solr-core/org/apache/solr/search/similarities/BM25SimilarityFactory.html IIRC, the similarity becomes a constant with k1 = 0. <similarity class="solr.BM25SimilarityFactory"><float name="k1">0</float></similarity> Thomas Op do 14 apr. 2022 om 15:51 schreef Vincenzo D'Amore : >
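
The effect of both parameters is easy to check with the textbook form of the BM25 term-frequency component. This Python sketch is not Lucene's implementation (which differs in details such as how document lengths are stored); it only illustrates the formula, and the default values shown are the commonly cited ones.

```python
def bm25_tf(freq: float, doc_len: float, avg_doc_len: float,
            k1: float = 1.2, b: float = 0.75) -> float:
    """Textbook BM25 term-frequency component for one term in one document."""
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return freq * (k1 + 1) / (freq + length_norm)


# k1 = 0: the component is 1 regardless of how often the term occurs.
print(bm25_tf(1, 10, 100, k1=0), bm25_tf(7, 500, 100, k1=0))

# b = 0: document length no longer influences the result.
print(bm25_tf(3, 10, 100, b=0) == bm25_tf(3, 1000, 100, b=0))
```

With k1 = 0 only the IDF part of the score is left to tell matches apart; with b = 0 term frequency still counts but long and short documents are treated alike.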

Re: Select Random set of data in SOLR

2022-04-11 Thread Thomas Corthals
Hi Fiz, Does RandomSortField suit your needs? https://solr.apache.org/docs/8_11_1/solr-core/org/apache/solr/schema/RandomSortField.html Thomas Op ma 11 apr. 2022 om 09:06 schreef Fiz N : > Hi SOLR experts, > > In my current project, we have a requirement to select random set of data > of N

Re: Regarding indexing data in different cores or same core with different entities.

2022-04-11 Thread Thomas Corthals
We have a similar setup where entities of different types all go in a single core. Folding, stemming, managed synonyms … have to be the same for all entity types. I find it easier to only have to keep one schema up to date with business needs. Adding a new entity type to the index can usually be

Re: Solr Cloud - Query with results around 2 million records time out.

2022-04-05 Thread Thomas Corthals
Hi Venkat, Do you mean 2 million documents in a single response? You should really consider pagination, preferably using a cursorMark. Regards, Thomas Op di 5 apr. 2022 om 13:40 schreef Puttaganti, Venkat <
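
The cursorMark suggestion can be sketched as a loop that keeps requesting pages until the cursor stops changing. This is a minimal sketch, not a client implementation: `iter_all_docs`, `fake_fetch` and the sample data are hypothetical, and `fake_fetch` is an in-memory stand-in for the HTTP request to Solr. The parameter names `cursorMark`, `rows` and `sort`, the initial value `*`, and the `nextCursorMark` response key are the ones Solr uses; the sort must include the uniqueKey field, assumed here to be `id`.

```python
def iter_all_docs(fetch_page, query, rows=1000):
    """Yield every matching document by following Solr-style cursor marks."""
    cursor = "*"  # Solr's starting value for a new cursor
    while True:
        params = {"q": query, "rows": rows, "sort": "id asc", "cursorMark": cursor}
        response = fetch_page(params)
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # unchanged cursor means no more results
            break
        cursor = next_cursor


# Hypothetical data and an in-memory stand-in for the real request.
DOCS = [{"id": str(i)} for i in range(1, 8)]


def fake_fetch(params):
    """Simulate paging; real cursor marks are opaque strings, not offsets."""
    start = 0 if params["cursorMark"] == "*" else int(params["cursorMark"])
    page = DOCS[start:start + params["rows"]]
    next_mark = str(start + len(page)) if page else params["cursorMark"]
    return {"response": {"docs": page}, "nextCursorMark": next_mark}


all_docs = list(iter_all_docs(fake_fetch, "*:*", rows=3))
print(len(all_docs))
```

In a real setup `fetch_page` would send `params` to the collection's select handler and return the decoded JSON response.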

Re: R: DIH and import from other core

2022-03-31 Thread Thomas Corthals
Op do 31 mrt. 2022 om 18:05 schreef dmitri maziuk : > On 2022-03-31 9:29 AM, Tealdi Paolo wrote: > > Hi Eric > > > > Many thanks for the answer. > > I noticed that reindexcollection seems to be SLOWER than DIH import. > > (Warning: there be python there) > > This is trimmed down from a working

Re: RE : Solr Date Range filtering using datapointfield (solr 8.8.2)

2022-03-21 Thread Thomas Corthals
Hi Reej, Have you tried with 2022-03-21T15:59:999Z (with a capital Z)? Regards, Thomas Op ma 21 mrt. 2022 om 12:20 schreef Reej Nayagam : > Hi All, > > In solr 4 , we had a date field and in schema, fieldType name = date and > class ="solr.*TrieDateField*" > when moving to solr8, we changed

Re: Return formatted dates from solr query

2022-03-17 Thread Thomas Corthals
If the desired output format is the same for every query, I would store the formatted date in a separate string field when indexing. You can query the date field, but return the string field instead. Thomas Op do 17 mrt. 2022 om 22:08 schreef Shawn Heisey : > On 3/17/22 14:46, Teresa McMains
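
As a rough illustration of that approach, the document could be built with both a date value for querying and a preformatted string for display. Everything named here is an assumption: the `build_doc` helper, the field names `created_dt` and `created_display_s`, and the display format are made up and would have to match the actual schema.

```python
from datetime import datetime, timezone


def build_doc(doc_id, when):
    """Return a document with a queryable date and a display string."""
    utc = when.astimezone(timezone.utc)
    return {
        "id": doc_id,
        # ISO 8601 in UTC with a trailing Z, the form Solr date fields expect.
        "created_dt": utc.strftime("%Y-%m-%dT%H:%M:%SZ"),
        # Hypothetical display format, stored as a plain string.
        "created_display_s": utc.strftime("%d/%m/%Y"),
    }


doc = build_doc("1", datetime(2022, 3, 17, 22, 8, tzinfo=timezone.utc))
print(doc)
```

Queries would then filter or sort on the date field and list only the string field in `fl`.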

Re: Duplicate keys in Luke JSON response

2022-02-23 Thread Thomas Corthals
Hi Christine This part of the response is built from a SimpleOrderedMap and json.nl only affects the representation of plain NamedLists. Regards Thomas Op di 22 feb. 2022 om 19:27 schreef Christine Poerschke (BLOOMBERG/ LONDON) : > Hi Thomas, > > JSON responses often can be customised with

Duplicate keys in Luke JSON response

2022-02-19 Thread Thomas Corthals
The Luke Request Handler returns duplicate object keys in the JSON response for multiValued fields. Only the last value survives the trip through a decoder. E.g. http://localhost:8983/solr/techproducts/admin/luke?show=doc=SP2514N Snippet of the response: "cat":{ "type":"string",
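
The effect on a decoder can be reproduced without Solr. The sketch below uses a made-up payload with a repeated key, not the actual Luke response, and shows that Python's `json` module keeps only the last value by default. The `collect_duplicates` hook is a hypothetical workaround that gathers repeated keys into a list.

```python
import json

# Hypothetical payload imitating an object with a repeated key.
raw = '{"cat": "electronics", "cat": "hard drive"}'

# Default decoding silently keeps only the last value for "cat".
plain = json.loads(raw)


def collect_duplicates(pairs):
    """Build a dict, turning the values of repeated keys into a list."""
    grouped = {}
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    # Unwrap keys that occurred once so ordinary objects look unchanged.
    return {k: v[0] if len(v) == 1 else v for k, v in grouped.items()}


merged = json.loads(raw, object_pairs_hook=collect_duplicates)
print(plain)
print(merged)
```

The hook cannot tell a repeated key apart from a single key whose value is already a list, so it is only a stopgap for inspecting such responses.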

[jira] [Commented] (SOLR-3798) copyField logic in LukeRequestHandler is primitive, doesn't work well with dynamicFields

2022-02-19 Thread Thomas Corthals (Jira)
[ https://issues.apache.org/jira/browse/SOLR-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17494915#comment-17494915 ] Thomas Corthals commented on SOLR-3798: --- {quote}I've been thinking about a related problem

[jira] [Commented] (SOLR-15895) Managed Resources with invalid filename characters on Windows

2022-01-05 Thread Thomas Corthals (Jira)
[ https://issues.apache.org/jira/browse/SOLR-15895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17469396#comment-17469396 ] Thomas Corthals commented on SOLR-15895: You can see the 0 byte file on disk, but it is not easy

[jira] [Commented] (SOLR-15895) Managed Resources with invalid filename characters on Windows

2022-01-05 Thread Thomas Corthals (Jira)
[ https://issues.apache.org/jira/browse/SOLR-15895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17469385#comment-17469385 ] Thomas Corthals commented on SOLR-15895: Steps to recreate with Stopwords. Same thing happens

[jira] [Created] (SOLR-15895) Managed Resources with invalid filename characters on Windows

2022-01-05 Thread Thomas Corthals (Jira)
Thomas Corthals created SOLR-15895: -- Summary: Managed Resources with invalid filename characters on Windows Key: SOLR-15895 URL: https://issues.apache.org/jira/browse/SOLR-15895 Project: Solr

[jira] [Commented] (SOLR-15116) Wrong HTTP status for HEAD request

2022-01-04 Thread Thomas Corthals (Jira)
[ https://issues.apache.org/jira/browse/SOLR-15116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17468909#comment-17468909 ] Thomas Corthals commented on SOLR-15116: I've commented on the PR. My patch was just a guess.
