Re: Classifier for query intent?

2018-04-03 Thread Georg Sorst
Hi wunder, this sounds like an interesting topic. Can you elaborate a bit on query intent classification? Where does the training data come from? Do you manually assign an intent to a query or can this be done in a (semi-)automatic way? Do you have a fixed list of possible intents (something like

Multi threaded document atomic OR in-place updates

2018-04-03 Thread pravesh
I have a scenario as follows: There are 2 separate threads, each of which will try to update the same document in a single index for 2 separate fields, for which we are using atomic OR in-place updates. For example, id is the unique field in the index. thread-1 will update the following info: id:1001 field-1
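
For readers unfamiliar with the update style discussed in this thread, here is a minimal sketch of what two such requests might look like. It assumes Solr's JSON atomic-update syntax with the `set` modifier. The field names `field_1` and `field_2` and their values are hypothetical, since the message is cut off before listing them. The code only builds request bodies and does not contact Solr.

```python
import json


def atomic_set(doc_id, field, value):
    """Build one atomic-update document that sets a single field."""
    return {"id": doc_id, field: {"set": value}}


# Two writers touching different fields of the same document.
# Field names and values are hypothetical placeholders.
thread_1_update = atomic_set("1001", "field_1", "value-from-thread-1")
thread_2_update = atomic_set("1001", "field_2", "value-from-thread-2")

# The JSON update handler accepts an array of documents as the request body.
body_1 = json.dumps([thread_1_update])
body_2 = json.dumps([thread_2_update])
```

Each body names only the unique key and the one field it changes, which is what distinguishes an atomic update from re-sending the whole document.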

SOLR Cloud: 1500+ threads are in TIMED_WAITING status

2018-04-03 Thread Doss
We have SOLR (7.0.1) cloud on 3 Linux VM instances with 4 CPUs and 90 GB RAM, with a zookeeper (3.4.11) ensemble running on the same machines. We have 130 cores with an overall size of 45GB. No sharding, almost all VMs have the same copy of the data. These nodes are under LB. Index Config: = 300 30

Re: Solr cloud schema and schemaless

2018-04-03 Thread Erick Erickson
The schema mode is _per collection_, not per node. So there's no trouble mixing replicas from collection A running schema model 1 with replicas from collection B running a different schema model. That said, schemaless is _not_ recommended for production unless you have total control over the ETL c
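
As an illustration of the per-collection point, the sketch below builds two Collections API `CREATE` requests that differ only in the configset they reference. The host, the collection names, and the configset names (`managed_conf`, `schemaless_conf`) are hypothetical; whether a given configset behaves as schemaless depends on its solrconfig.xml, which is not shown here. The code only builds URLs and does not send them.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, config_name,
                          num_shards=1, replication_factor=1):
    """Build a Collections API CREATE URL for one collection."""
    params = {
        "action": "CREATE",
        "name": name,
        "collection.configName": config_name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical host, collection names, and configset names.
base = "http://localhost:8983/solr"
url_a = create_collection_url(base, "collection_a", "managed_conf")
url_b = create_collection_url(base, "collection_b", "schemaless_conf")
```

Both collections can then place replicas on the same nodes, because each one carries its own configuration.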

Re: Largest number of indexed documents used by Solr

2018-04-03 Thread Yago Riveiro
Hi, In my company we are running a 12-node cluster with 10 (American) billion documents, 12 shards / 2 replicas. We do mainly faceting queries with very reasonable performance. 36 million documents is not an issue; you can handle that volume of documents with 2 nodes with SSDs and 32G of ra

Re: Largest number of indexed documents used by Solr

2018-04-03 Thread Walter Underwood
We have a 24 million document index. Our documents are a bit smaller than yours, homework problems. The Hathi Trust probably has the record. They haven’t updated their blog for a while, but they were at 11 million books and billions of pages in 2014. https://www.hathitrust.org/blogslarge-scale-

Re: Largest number of indexed documents used by Solr

2018-04-03 Thread Abhi Basu
We have tested Solr 4.10 with 200 million docs with avg doc size of 250 KB. No issues with performance when using 3 shards / 2 replicas. On Tue, Apr 3, 2018 at 8:12 PM, Steven White wrote: > Hi everyone, > > I'm about to start a project that requires indexing 36 million records > using Solr 7.

Largest number of indexed documents used by Solr

2018-04-03 Thread Steven White
Hi everyone, I'm about to start a project that requires indexing 36 million records using Solr 7.2.1. Each record ranges from 500 KB to 0.25 MB where the average is 0.1 MB. Has anyone indexed this number of records? What are the things I should worry about? And out of curiosity, what is the lar

Re: How do I create a schema file for FIX data in Solr

2018-04-03 Thread Raymond Xie
I'm talking to the author to find out, thanks. ~~~sent from my cell phone, sorry if there is any typo On Tue, Apr 3, 2018 at 1:38 PM, Adhyan Arizki wrote: > Raymond, > > Seems you are having issue with the node environment. Likely the path isn't > registered correctly judging from the error message. Note thou

Solr cloud schema and schemaless

2018-04-03 Thread Kojo
Hi Solrs, We have a Solr cloud running on three nodes. Five collections are running in schema mode and we would like to create another collection running schemaless. Can schema and schemaless collections coexist on the same nodes? I am not sure, because on this page it starts solr in schemaless m

RE: 7.2.1 cluster dies within minutes after restart

2018-04-03 Thread Markus Jelsma
To clear things up, this has been resolved; the problem was present in our custom analyzers where we loaded dictionaries in the wrong method. If I remember correctly, we loaded them in createComponents (or the other one, don't have the code here), so it was per-thread loading of dictionaries. A

Re: querying vs. highlighting: complete freedom?

2018-04-03 Thread David Smiley
On Tue, Apr 3, 2018 at 10:51 AM Arturas Mazeika wrote: ... > Similarly, there's the > hl.qparser parameter, but the documentation of that parameter is not as > rich (the documentation says, that the default value is lucene). I am > wondering are there other alternatives available? In case you ar

Re: solr 5.2->7.2, suggester failure

2018-04-03 Thread David Hastings
Ah, thank you. Turns out it was an experiment, so I removed them anyway and it's all good now. Since I'm here in the configuration for the new 7.x instances I was going to ask a side question. A lot of my Java properties are old or have been tweaked over time from a series of different machines,

Re: solr 5.2->7.2, suggester failure

2018-04-03 Thread Kevin Risden
It looks like there were changes in Lucene 7.0 that limited the size of the automaton to prevent overflowing the stack. https://issues.apache.org/jira/browse/LUCENE-7914 The commit being: https://github.com/apache/lucene-solr/commit/7dde798473d1a8640edafb41f28ad25d17f25a2d Kevin Risden On Tue,

Re: solr 5.2->7.2, suggester failure

2018-04-03 Thread David Hastings
For data, it's primarily a lot of garbage, around 200k titles, varying length. I'm actually looking through my application now to see if I even still use it or if it was an early experiment. I am just finding it odd that it's failing in 7 but does fine on 5 On Tue, Apr 3, 2018 at 2:41 PM, Erick Er

Re: solr 5.2->7.2, suggester failure

2018-04-03 Thread Erick Erickson
What kinds of things go into your title field? On first blush that's a bit odd for a multi-word title field since it treats the entire input as a single string. The code is trying to build a large FST to hold all of this data. Would AnalyzingInfixLookupFactory or similar make more sense? buildOnSt

solr 5.2->7.2, suggester failure

2018-04-03 Thread David Hastings
Hey all, I recently got a 7.2 instance up and running, and it seems to be going well; however, I have run into this when creating one of my indexes, and was wondering if anyone had a quick idea right off the top of their head. solrconfig: fixspell FuzzyLookupFactory string

Re: How do I create a schema file for FIX data in Solr

2018-04-03 Thread Adhyan Arizki
Raymond, Seems you are having an issue with the node environment. Likely the path isn't registered correctly, judging from the error message. Note though, this is no longer a Solr issue. On Tue, 3 Apr 2018, 23:00 Raymond Xie, wrote: > Hi Rick, > > Following your suggestion I found https://

Re: Learning to Rank (LTR) with grouping

2018-04-03 Thread ilayaraja
Thanks Roopa. I was expecting that the issue has been fixed in solr 7.0 as per here https://issues.apache.org/jira/browse/SOLR-8776. Let me see why it is still not working on solr-ltr-7.2.1 - --Ilay -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr 6. 3 Can not talk to ZK Updates are disabled

2018-04-03 Thread Erick Erickson
With beefy machines, one strategy is to create multiple JVMs. For example, if you have one JVM and it hosts 32 replicas, splitting that up to 4 JVMs hosting 8 replicas each. That can allow you to drop down the heap allocated to each. Managing memory is always "exciting" at scale. If you're sorting

Re: some parent documents

2018-04-03 Thread Arturas Mazeika
Hi Mikhail, Thanks a lot for the reply. You mentioned that q=+{!parent which.. v='+text:hello +person:A'} +{!parent which..v='+text:ciao +person:B'} is the way to go. What would it look like precisely for the following collection? { "id":1, "_childDocuments_": [ {"id":"1_1",

Re: SolrCloud 5.2.1 - collection creation error

2018-04-03 Thread bondinthepond
Hi Aaron Gibbons, I need your help. What changes did you make to the scripts on the zookeeper machine? I am stuck with a similar problem. Thanks in advance.

Re: How do I create a schema file for FIX data in Solr

2018-04-03 Thread Raymond Xie
Hi Rick, Following your suggestion I found https://github.com/SunGard-Labs/fix2json which seems to be a fit; I followed the installation instructions and successfully installed fix2json on my Ubuntu host. sudo npm install -g fix2json I ran the same command as indicated in the git: fix2json

Re: querying vs. highlighting: complete freedom?

2018-04-03 Thread Arturas Mazeika
Hi David, Thanks a lot for the reply and the info. I suspected that the minimum on the indexing/storage side was that hl.fl needs to be "stored". I understand that my expression "minimal requirements" is totally loose/unclear, I wasn't sure how to formulate that as (i) I am not yet sure how to e

Re: some parent documents

2018-04-03 Thread Mikhail Khludnev
Hello, Arturas. TLDR; Please find inline below. On Tue, Apr 3, 2018 at 5:14 PM, Arturas Mazeika wrote: > Hi Solr Fans, > > I am trying to make sense of information retrieval using expressions like > "some parent", "*only parent*", " *all parent*". I am also trying to > understand the syntax "!p

Re: querying vs. highlighting: complete freedom?

2018-04-03 Thread David Smiley
Thanks for your review! On Tue, Apr 3, 2018 at 6:56 AM Arturas Mazeika wrote: ... > What I missed at the beginning of the documentation is the minimal set of > requirements that is required to have highlighting sensible: somehow I > have a feeling that one needs some of the information stored

some parent documents

2018-04-03 Thread Arturas Mazeika
Hi Solr Fans, I am trying to make sense of information retrieval using expressions like "some parent", "*only parent*", " *all parent*". I am also trying to understand the syntax "!parent which" and "!child of". On the technical level, I am reading the following documents: [1] https://lucene.apac

SolrCloud 7.2 problem with leader election

2018-04-03 Thread Gael Jourdan-Weil
Hello, We are trying to upgrade from Solr 6.6 to Solr 7.2.1 and we are using Solr Cloud. Doing some tests with 2 replicas, ZooKeeper doesn't know which one to elect as a leader: ERROR org.apache.solr.cloud.ZkController:getLeader:1206 - Error getting leader from zk org.apache.solr.common.Solr

Re: Trying to Restore older indexes in Solr7.2.1

2018-04-03 Thread Shawn Heisey
On 4/3/2018 3:22 AM, Mugdha Varadkar wrote: is the collection using the compositeId router? Yes collection used of both the versions are using compositeId router, PFA screenshot of the same. If you attached a screenshot, it was lost.  The mailing list does not allow most attachments thr

Re: Need help to get started on Solr, searching get nothing. Thank you very much in advance

2018-04-03 Thread Shawn Heisey
On 4/2/2018 9:00 PM, Raymond Xie wrote: I see there is "/browse" in solrconfig.xml : explicit and name="defaults" with one item of "df" as shown below: _text_ My understanding is I can put whatever fields I want to enable index and searchin

Re: querying vs. highlighting: complete freedom?

2018-04-03 Thread Arturas Mazeika
Hi David, Thanks a lot for the reply, the effort to update the documentation, and having the documentation reflect the question I posted here. I've read the doc you provided. I've read the updated parts and the document as carefully as I could. I've browsed and skimmed part of the document (whe

Re: MatchMode in Dismax parser

2018-04-03 Thread lsharma3
Hi Shawn, My code is ready, I just need to raise the PR for it. Can you please guide me in raising my first PR for Solr? Regards, Lucky Sharma

Re: MatchMode in Dismax parser

2018-04-03 Thread lsharma3
Hi Shawn, I have already made the changes for this, can you guide me to raise my first PR :) . It would be a great help. Regards, Lucky Sharma

RE: PreAnalyzed FieldType, and simultaneously importing JSON

2018-04-03 Thread Markus Jelsma
Hi David! Many thanks, this looks much better! Regards, Markus -Original message- > From:David Smiley > Sent: Monday 2nd April 2018 21:27 > To: solr-user@lucene.apache.org > Subject: Re: PreAnalyzed FieldType, and simultaneously importing JSON > > Hello Markus, > > It appears you are

Re: Trying to Restore older indexes in Solr7.2.1

2018-04-03 Thread Mugdha Varadkar
Hi Shawn Heisey, Thank you for the reply given here. Please find below answers to your questions, is the collection using the compositeId router? Yes collection us

Copy field on dynamic fields?

2018-04-03 Thread jatin roy
Hi, Can we create a copy field on dynamic fields? If yes, then how does it decide which field should be copied to which one? For example: if I have a dynamic field: category_* and while indexing 4 fields are formed such as: category_1 category_2 category_3 category_4 and now I have to copy the contents o