date:20170427

Empty value fields not indexed

2017-04-27 Thread Zheng Lin Edwin Yeo

Hi, I'm using Solr 6.4.2, and I realized that for those fields which have no values, the field name is not indexed into Solr. It was working fine in the previous version. Any reason for this or any settings which need to be done so that the field name can be indexed even though its value is

Re: DIH Speed

2017-04-27 Thread Vijay Kokatnur

Let me clarify - DIH is running on Solr 6.5.0 that calls a different Solr instance running on 4.5.0, which has 150M documents. If we try to fetch them using DIH onto the new Solr cluster, wouldn't it result in deep paging on Solr 4.5.0 and drastically slow down indexing on Solr 6.5.0? On Thu, Apr

Re: Poll: Master-Slave or SolrCloud?

2017-04-27 Thread David Lee

As someone who moved from ES to Solr, I can say that one of the things that makes ES so much easier to configure is that the majority of things that need to be set for a specific environment are all in pretty much one config file. Also, I didn't have to deal with the "magic stuff" that many

Re: DIH Speed

2017-04-27 Thread Shawn Heisey

On 4/27/2017 9:15 PM, Vijay Kokatnur wrote: > Hey Shawn, Unfortunately, we can't upgrade the existing cluster. That > was my first approach as well. Yes, SolrEntityProcessor is used so it > results in deep paging after certain rows. I have observed that > instead of importing for a larger period,

Re: DIH Speed

2017-04-27 Thread Vijay Kokatnur

Hey Shawn, Unfortunately, we can't upgrade the existing cluster. That was my first approach as well. Yes, SolrEntityProcessor is used so it results in deep paging after certain rows. I have observed that instead of importing for a larger period, if data is imported only for 4 hours at a time,
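The approach of importing 4 hours at a time can be sketched as a loop over fixed time windows, each turned into a range filter for the source query. This is only a sketch under assumptions: the field name `timestamp` is hypothetical, the datetimes are taken to be UTC, and how each filter is handed to DIH or SolrEntityProcessor is left out.

```python
from datetime import datetime, timedelta, timezone


def import_windows(start, end, hours=4):
    """Yield (window_start, window_end) pairs that cover [start, end)."""
    step = timedelta(hours=hours)
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield cursor, upper
        cursor = upper


def range_filter(field, lower, upper):
    """Build a Solr range filter with an inclusive lower and exclusive upper bound.

    Assumes both datetimes are already in UTC.
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"{field}:[{lower.strftime(fmt)} TO {upper.strftime(fmt)}}}"


if __name__ == "__main__":
    day_start = datetime(2017, 4, 27, tzinfo=timezone.utc)
    day_end = day_start + timedelta(days=1)
    for lower, upper in import_windows(day_start, day_end):
        # "timestamp" is a hypothetical field name, not one from the thread.
        print(range_filter("timestamp", lower, upper))
```

The half-open range keeps a document that sits exactly on a window boundary from being imported twice.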

Re: 1 main collection or multiple smaller collections?

2017-04-27 Thread Derek Poh

Richard I am considering the same option as your suggestion to put them in 1 single collection of products documents. A product doc containing the supplier info. In this option, a supplier info will get repeated in each of the supplier's product doc. I may be influenced by DB concepts. Guess it's a

Re: 1 main collection or multiple smaller collections?

2017-04-27 Thread Derek Poh

Hi Shawn 1 set of data is suppliers info and 1 set is the suppliers products info. User can either do a product search or a supplier search. 1 option I am thinking of is to put them in 1 single collection with each product as a document. Each product document will have the supplier info in it.

Re: DIH Speed

2017-04-27 Thread Shawn Heisey

On 4/27/2017 5:40 PM, Erick Erickson wrote: > I'm unclear why DIH and deep paging are mixed. DIH is indexing and deep paging > is querying. > > If it's querying, consider cursorMark or the /export handler. >

Re: Atomic Updates

2017-04-27 Thread Erick Erickson

Been there, done that, got the t-shirt. Thanks for closing it out! Erick On Thu, Apr 27, 2017 at 10:29 AM, Chris Ulicny wrote: > While recreating it with a fresh schema, I realized that this was a case of > a very, very stupid user error during configuring the cores. > > I

Re: DIH Speed

2017-04-27 Thread Erick Erickson

I'm unclear why DIH and deep paging are mixed. DIH is indexing and deep paging is querying. If it's querying, consider cursorMark or the /export handler. https://lucidworks.com/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/ If it's DIH, please explain a bit
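For the cursorMark suggestion, a minimal sketch of cursor-based iteration might look like the following. The `fetch` callable, the unique key field `id`, and the base URL passed to `http_fetch` are assumptions for illustration. As far as I recall, cursorMark needs Solr 4.7 or later, so it would not apply as-is to a 4.5.0 source.

```python
import json
import urllib.parse
import urllib.request


def http_fetch(base_url, params):
    """Run one /select request and return the parsed JSON body.

    base_url is a hypothetical core or collection URL supplied by the caller.
    """
    url = base_url + "/select?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def iter_docs(fetch, query="*:*", sort="id asc", rows=1000):
    """Yield every matching document by following nextCursorMark.

    fetch is any callable taking a params dict and returning the response body.
    The sort must include the unique key field (assumed here to be "id").
    """
    cursor = "*"  # "*" starts a new cursor
    while True:
        params = {"q": query, "sort": sort, "rows": rows,
                  "cursorMark": cursor, "wt": "json"}
        body = fetch(params)
        yield from body["response"]["docs"]
        next_cursor = body["nextCursorMark"]
        if next_cursor == cursor:
            break  # an unchanged cursor means the result set is exhausted
        cursor = next_cursor
```

Usage would be along the lines of `iter_docs(lambda p: http_fetch(base_url, p))`. Unlike `start`/`rows` paging, each request costs roughly the same no matter how far into the result set it is.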

Solr Query Performance benchmarking

2017-04-27 Thread Suresh Pendap

Hi, I am trying to perform Solr Query performance benchmarking and trying to measure the maximum throughput and latency that I can get from a given Solr cluster. Following are my configurations Number of Solr Nodes: 4 Number of shards: 2 replication-factor: 2 Index size: 55 GB Shard/Core
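For the throughput and latency figures mentioned here, one way to summarize a benchmark run is shown below. It is a sketch only: it assumes the per-query latencies and the total wall-clock time were already collected by some load driver, and it uses the nearest-rank percentile definition, which is one of several in common use.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(pct / 100 * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]


def summarize(latencies_s, wall_clock_s):
    """Summarize one run: query count, throughput, and latency percentiles."""
    ordered = sorted(latencies_s)
    return {
        "queries": len(ordered),
        "throughput_qps": len(ordered) / wall_clock_s,
        "p50_s": percentile(ordered, 50),
        "p95_s": percentile(ordered, 95),
        "max_s": ordered[-1],
    }
```

Reporting percentiles rather than the mean keeps a few slow outliers from hiding the typical latency, or the typical latency from hiding the outliers.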

DIH Speed

2017-04-27 Thread Vijay Kokatnur

We have a new solr 6.5.0 cluster, for which data is being imported via DIH from another Solr cluster running version 4.5.0. This question comes back to deep paging, but we have observed that after 30 minutes of querying the rate of processing goes down from 400/s to about 120/s. At that point it

TransactionLog doesn't know how to serialize class java.util.UUID; try implementing ObjectResolver?

2017-04-27 Thread Mahmoud Almokadem

Hello, When I try to update a document exists on solr cloud I got this message: TransactionLog doesn't know how to serialize class java.util.UUID; try implementing ObjectResolver? With the stack trace:

Re: Atomic Updates

2017-04-27 Thread Chris Ulicny

While recreating it with a fresh schema, I realized that this was a case of a very, very stupid user error during configuring the cores. I set up the testing cores with the wrong configset, and then proceeded to edit the schema in the right configset. So, the field was actually stored by default,

Re: Split Shard not working

2017-04-27 Thread Walter Underwood

What is the message in the log when it crashes? wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Apr 27, 2017, at 10:10 AM, Vijay Kokatnur wrote: > > We recently upgraded 4.5 index to 6.5 using IndexUpgrader. The index

Split Shard not working

2017-04-27 Thread Vijay Kokatnur

We recently upgraded a 4.5 index to 6.5 using IndexUpgrader. The index size is around 600 GB on disk. When we try to split it using SPLITSHARD, it creates two new sub shards on the node and eventually crashes before completing the split. After restart, the original shard size is around 100 GB and

Re: 1 main collection or multiple smaller collections?

2017-04-27 Thread Walter Underwood

Design backwards from the search result pages (SRP). Make flat schema(s) with the fields you will search and display. One example is the schema I used at Netflix. I used one collection to hold movies, people (actors), and genres. There were collisions between the integer IDs, movies IDs were

Re: 1 main collection or multiple smaller collections?

2017-04-27 Thread Rick Leir

Does it make sense to use nested documents here? Products could be nested in a supplier document perhaps. Alternately, consider de-normalizing "til it hurts". A product doc might be able to contain supplier info. On April 27, 2017 8:50:59 AM EDT, Shawn Heisey wrote: >On
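The de-normalizing option can be pictured as copying each supplier's fields into every one of its product documents before indexing. The field names (`supplier_id`, `name`, `country`) and the `supplier_` prefix below are hypothetical; this is a sketch of the idea, not the schema discussed in the thread.

```python
def denormalize(products, suppliers):
    """Return flat product docs with the owning supplier's fields copied in."""
    by_id = {s["supplier_id"]: s for s in suppliers}
    docs = []
    for product in products:
        supplier = by_id[product["supplier_id"]]
        doc = dict(product)
        # Prefix supplier fields so they cannot collide with product fields.
        doc.update({f"supplier_{key}": value
                    for key, value in supplier.items()
                    if key != "supplier_id"})
        docs.append(doc)
    return docs


if __name__ == "__main__":
    suppliers = [{"supplier_id": "s1", "name": "Acme", "country": "SG"}]
    products = [{"id": "p1", "name": "Widget", "supplier_id": "s1"}]
    print(denormalize(products, suppliers))
```

The trade-off is the one raised in the thread: supplier data is repeated in every product doc, so a supplier change means reindexing all of that supplier's products.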

Re: size-estimator-lucene-solr.xls error in disk space estimator

2017-04-27 Thread Matteo Grolla

Right Alessandro that's another bug Cheers 2017-04-27 12:30 GMT+02:00 alessandro.benedetti : > +1 > I would add that what is called : "Avg. Document Size (KB)" seems more to > me > "Avg. Field Size (KB)". > Cheers > > > > - > --- > Alessandro Benedetti >

Re: Atomic Updates

2017-04-27 Thread Chris Ulicny

I'm sending commit=true with every update while testing. I'll write up the tests and see if someone else can reproduce it. On Thu, Apr 27, 2017 at 10:54 AM Erick Erickson wrote: > bq: but is there any possibility that the values stick around until > there is a segment

Re: Atomic Updates

2017-04-27 Thread Erick Erickson

bq: but is there any possibility that the values stick around until there is a segment merge for some strange reason There better not be or it's a bug. Things will stick around until you issue a commit, is there any chance that's the problem? If you can document the exact steps, maybe we can

Blocked ConcurrentUpdateSolrClient

2017-04-27 Thread Christian Belka

Hello I am trying to update larger amounts of Documents (mostly ADD/DELETE) through various threads. After a certain amount of time (a few hours) all my threads get stuck at "taskExecutor-46" prio=5 tid=0x268 nid=0x10c BLOCKED owned by taskExecutor-9 Id=230 - stats: cpu=2788

Re: Indexing I/O errors and CorruptIndex messages

2017-04-27 Thread simon

Nope ... huge file system (600gb) only 50% full, and a complete index would be 80gb max. On Wed, Apr 26, 2017 at 4:04 PM, Erick Erickson wrote: > Disk space issue? Lucene requires at least as much free disk space as > your index size. Note that the disk full issue will

Re: Atomic Updates

2017-04-27 Thread Chris Ulicny

Yeah, something's not quite right somewhere. We never even considered in-place updates an option since it requires the fields to be non-indexed and non-stored. Our schemas never have any field that satisfies those two conditions let alone the other necessary ones. I went ahead and tested the
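For readers following the thread, an ordinary atomic update (as opposed to an in-place update) is sent as a document whose changed fields carry a modifier such as `set`, `add`, `remove`, or `inc`. The sketch below only builds the JSON payload; the unique key name `id` and the field names in the usage line are assumptions, and posting the payload to the update handler is left out.

```python
import json

# Modifiers this sketch accepts; Solr also has others (for example removeregex).
ALLOWED_OPS = {"set", "add", "remove", "inc"}


def atomic_update(doc_id, **changes):
    """Build a JSON atomic-update payload for a single document.

    Each keyword argument maps a field name to an (operation, value) pair.
    """
    doc = {"id": doc_id}  # assumes the unique key field is called "id"
    for field, (op, value) in changes.items():
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported atomic update operation: {op}")
        doc[field] = {op: value}
    return json.dumps([doc])


if __name__ == "__main__":
    # Hypothetical fields: overwrite "price" and increment "views" by one.
    print(atomic_update("doc1", price=("set", 10), views=("inc", 1)))
```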

Re: Spatial Search: can not use FieldCache on a field which is neither indexed nor has doc values: latitudeLongitude_0_coordinate

2017-04-27 Thread freddy79

It does work with "solr.LatLonPointSpatialField" instead of "solr.LatLonType". But why not with "solr.LatLonType"? -- View this message in context:

Re: Update to Solr 6 - Amazon EC2 high CPU SYS usage

2017-04-27 Thread Shawn Heisey

On 4/27/2017 3:03 AM, Elodie Sannier wrote: > We have migrated from Solr 5.4.1 to Solr 6.4.0 on Amazon EC2 and we have > a high CPU SYS usage and it drastically decreases the Solr performance. > > The JVM version (java-1.8.0-openjdk-1.8.0.131-0.b11.el6_9.x86_64), the > Jetty version (9.3.14) and

Re: 1 main collection or multiple smaller collections?

2017-04-27 Thread Shawn Heisey

On 4/26/2017 11:57 PM, Derek Poh wrote: > There are some common fields between them. > At the source data end (database), the supplier info and product info > are updated separately. In this regard, I should separate them? > If it's in 1 single collection, when there are updates to only the >

Spatial Search: can not use FieldCache on a field which is neither indexed nor has doc values: latitudeLongitude_0_coordinate

2017-04-27 Thread freddy79

Hi, when doing a query with spatial search i get the error: can not use FieldCache on a field which is neither indexed nor has doc values: latitudeLongitude_0_coordinate *SOLR Version:* 6.1.0 *schema.xml:* *Query:*

[ANNOUNCE] Apache Solr 6.5.1 released

2017-04-27 Thread jim ferenczi

27 April 2017, Apache Solr™ 6.5.1 available The Lucene PMC is pleased to announce the release of Apache Solr 6.5.1 Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting,

[ANNOUNCE] Apache Solr 6.5.1 released

2017-04-27 Thread jim ferenczi

27 April 2017, Apache Solr™ 6.5.1 available The Lucene PMC is pleased to announce the release of Apache Solr 6.5.1 Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted

[ANNOUNCE] Apache Solr 6.5.1 released

2017-04-27 Thread jim ferenczi

27 April 2017, Apache Solr™ 6.5.1 available The Lucene PMC is pleased to announce the release of Apache Solr 6.5.1 Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted

Re: Help with facet.limit

2017-04-27 Thread alessandro.benedetti

In addition to what Erick mentioned, (if) you can use Json faceting and sort your facets according to your preferences using the stats integration [1]. Cheers [1] https://cwiki.apache.org/confluence/display/solr/Faceted+Search - --- Alessandro Benedetti Search Consultant, R
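A JSON facet request that sorts buckets by a statistic might be shaped like the sketch below. The field names `category` and `price`, the facet label `by_field`, and the stat name `avg_price` are hypothetical, so treat it as an illustration of the request shape rather than a tested query.

```python
import json


def terms_facet_sorted_by_stat(field, stat_name, stat_expr, limit=10, descending=True):
    """Build a JSON request body with a terms facet sorted by a nested stat."""
    direction = "desc" if descending else "asc"
    return {
        "query": "*:*",
        "facet": {
            "by_field": {
                "type": "terms",
                "field": field,
                "limit": limit,
                # Sort buckets by the stat computed in the nested facet below.
                "sort": f"{stat_name} {direction}",
                "facet": {stat_name: stat_expr},
            }
        },
    }


if __name__ == "__main__":
    body = terms_facet_sorted_by_stat("category", "avg_price", "avg(price)")
    print(json.dumps(body, indent=2))
```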

Re: counting_number_of_term_in_a_doc

2017-04-27 Thread alessandro.benedetti

I think the closest you get out of the box is the term vector component[1] . Cheers [1] https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component - --- Alessandro Benedetti Search Consultant, R&D Software Engineer, Director Sease Ltd. - www.sease.io -- View this

Re: size-estimator-lucene-solr.xls error in disk space estimator

2017-04-27 Thread alessandro.benedetti

+1 I would add that what is called : "Avg. Document Size (KB)" seems more to me "Avg. Field Size (KB)". Cheers - --- Alessandro Benedetti Search Consultant, R&D Software Engineer, Director Sease Ltd. - www.sease.io -- View this message in context:

size-estimator-lucene-solr.xls error in disk space estimator

2017-04-27 Thread Matteo Grolla

It seems to me that the estimation in MB is in fact an estimation in GB: the formula includes the avg doc size, which is in KB, so the result is in KB and should be divided by 1024 to obtain the result in MB. But it's divided by 1024*1024
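The unit problem is easy to check with a small calculation. The sketch below reduces the spreadsheet formula to "number of docs times average doc size in KB", which is an assumption made for illustration; the real sheet has more terms.

```python
def estimate_kb(num_docs, avg_doc_size_kb):
    """Total size in KB under the simplified formula."""
    return num_docs * avg_doc_size_kb


def kb_to_mb(size_kb):
    """KB to MB: divide by 1024 once."""
    return size_kb / 1024


def kb_to_gb(size_kb):
    """KB to GB: divide by 1024 * 1024, which is what the sheet reportedly does."""
    return size_kb / (1024 * 1024)


if __name__ == "__main__":
    total_kb = estimate_kb(1_000_000, 2)  # one million docs of 2 KB each
    print(f"{kb_to_mb(total_kb):.3f} MB")
    print(f"{kb_to_gb(total_kb):.3f} GB")
```

Dividing a KB total by 1024*1024 and labeling the result MB understates the estimate by a factor of 1024.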

Update to Solr 6 - Amazon EC2 high CPU SYS usage

2017-04-27 Thread Elodie Sannier

Hello, We have migrated from Solr 5.4.1 to Solr 6.4.0 on Amazon EC2 and we have a high CPU SYS usage and it drastically decreases the Solr performance. The JVM version (java-1.8.0-openjdk-1.8.0.131-0.b11.el6_9.x86_64), the Jetty version (9.3.14) and the OS version (CentOS 6.9) have not changed

Re: Poll: Master-Slave or SolrCloud?

2017-04-27 Thread Emir Arnautovic

I think creating a poll for ES people with the question: "How do you run master nodes? A) on some data nodes B) dedicated node C) dedicated server" would give some insight into how big an issue having ZK is, and whether hiding ZK behind Solr would do any good. Emir On 25.04.2017 23:13, Otis Gospodnetić wrote: Hi

37 matches