Hi,
I'm using Solr 6.4.2, and I realized that for fields which have no
values, the field name is not indexed into Solr.
It was working fine in the previous version.
Any reason for this, or any settings which need to be done so that the
field name can be indexed even though its value is
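If the goal is simply to have some value indexed for the field even when the source supplies none, one possible workaround (a sketch, not a confirmed fix for the behavior change) is to fill empty fields with a default at index time via an update processor chain in solrconfig.xml; the field name and default value below are placeholders:

```xml
<!-- solrconfig.xml: hypothetical chain that fills a missing field -->
<updateRequestProcessorChain name="add-defaults" default="true">
  <processor class="solr.DefaultValueUpdateProcessorFactory">
    <str name="fieldName">my_field</str>   <!-- placeholder field name -->
    <str name="value">__empty__</str>      <!-- placeholder default value -->
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```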
Let me clarify -
DIH is running on Solr 6.5.0 and calls a different Solr instance running
on 4.5.0, which has 150M documents. If we try to fetch them using DIH onto
the new Solr cluster, wouldn't it result in deep paging on Solr 4.5.0 and
drastically slow down indexing on Solr 6.5.0?
On Thu, Apr
As someone who moved from ES to Solr, I can say that one of the things
that makes ES so much easier to configure is that the majority of things
that need to be set for a specific environment are all in pretty much
one config file. Also, I didn't have to deal with the "magic stuff" that
many
On 4/27/2017 9:15 PM, Vijay Kokatnur wrote:
> Hey Shawn, Unfortunately, we can't upgrade the existing cluster. That
> was my first approach as well. Yes, SolrEntityProcessor is used so it
> results in deep paging after certain rows. I have observed that
> instead of importing for a larger period,
Hey Shawn,
Unfortunately, we can't upgrade the existing cluster. That was my first
approach as well.
Yes, SolrEntityProcessor is used, so it results in deep paging after a
certain number of rows.
I have observed that instead of importing for a larger period, if data is
imported only for 4 hours at a time,
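That windowed import could be expressed roughly like this in the DIH data-config.xml (a sketch; the URL, timestamp field, and time range are placeholder assumptions, not the actual configuration):

```xml
<!-- data-config.xml: hypothetical 4-hour import window -->
<dataConfig>
  <document>
    <entity name="legacy" processor="SolrEntityProcessor"
            url="http://old-solr:8983/solr/core1"
            query="timestamp:[2017-04-27T00:00:00Z TO 2017-04-27T04:00:00Z]"
            rows="500"/>
  </document>
</dataConfig>
```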
Richard
I am considering the same option as your suggestion to put them in 1 single
collection of product documents. A product doc containing the supplier info.
In this option, a supplier's info will get repeated in each of the
supplier's product docs. I may be influenced by DB concepts. Guess it's a
Hi Shawn
1 set of data is supplier info and 1 set is the suppliers' products info.
Users can either do a product search or a supplier search.
1 option I am thinking of is to put them in 1 single collection with each
product as a document. Each product document will have the supplier info
in it.
On 4/27/2017 5:40 PM, Erick Erickson wrote:
> I'm unclear why DIH and deep paging are mixed. DIH is indexing and deep paging
> is querying.
>
> If it's querying, consider cursorMark or the /export handler.
>
Been there, done that, got the t-shirt. Thanks for closing it out!
Erick
On Thu, Apr 27, 2017 at 10:29 AM, Chris Ulicny wrote:
> While recreating it with a fresh schema, I realized that this was a case of
> a very, very stupid user error during configuring the cores.
>
> I
I'm unclear why DIH and deep paging are mixed. DIH is
indexing and deep paging is querying.
If it's querying, consider cursorMark or the /export handler.
https://lucidworks.com/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/
If it's DIH, please explain a bit
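To illustrate why deep paging is costly while cursor-based iteration is not, here is a toy Python sketch (a model of the access pattern only, not actual Solr internals): each start/rows page must re-collect everything before it, so total work grows quadratically, whereas a cursor visits each document once:

```python
# Toy model of documents touched per strategy when walking a large index.

def paged_cost(total_docs, page_size):
    """start/rows deep paging: the page at offset N collects and skips N docs."""
    return sum(start + page_size for start in range(0, total_docs, page_size))

def cursor_cost(total_docs, page_size):
    """cursorMark-style iteration: every document is visited exactly once."""
    return total_docs

deep = paged_cost(1_000_000, 1000)     # grows quadratically with page count
cursor = cursor_cost(1_000_000, 1000)  # grows linearly
```

With these numbers the paged walk touches over 500 times as many documents as the cursor walk, which matches the throughput collapse described elsewhere in this thread.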
Hi,
I am trying to perform Solr query performance benchmarking and trying to
measure the maximum throughput and latency that I can get from a given Solr
cluster.
Following are my configurations
Number of Solr Nodes: 4
Number of shards: 2
replication-factor: 2
Index size: 55 GB
Shard/Core
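A minimal sketch of such a throughput/latency harness in Python (the query function is a stub standing in for a real HTTP request to the cluster; all names and timings here are illustrative assumptions):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_query(q):
    # Stub: replace with an actual HTTP request to the Solr cluster.
    time.sleep(0.001)
    return {"status": 0}

def benchmark(queries, workers=8):
    """Return (throughput in queries/sec, average latency in seconds)."""
    latencies = []

    def timed(q):
        t0 = time.perf_counter()
        run_query(q)
        latencies.append(time.perf_counter() - t0)

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(timed, queries))
    wall = time.perf_counter() - start
    return len(queries) / wall, sum(latencies) / len(latencies)

qps, avg_latency = benchmark(["q%d" % i for i in range(100)])
```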
We have a new solr 6.5.0 cluster, for which data is being imported via DIH
from another Solr cluster running version 4.5.0.
This question comes back to deep paging, but we have observed that after 30
minutes of querying the rate of processing goes down from 400/s to about
120/s. At that point it
Hello,
When I try to update a document that exists on Solr Cloud I get this message:
TransactionLog doesn't know how to serialize class java.util.UUID; try
implementing ObjectResolver?
With the stack trace:
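The message suggests a raw java.util.UUID object is being sent as a field value, which the transaction log cannot serialize. A common workaround is converting UUIDs to strings on the client before indexing; a sketch in Python (field names are placeholders):

```python
import uuid

def prepare_doc(doc):
    """Convert UUID field values to strings so the update is serializable."""
    return {k: str(v) if isinstance(v, uuid.UUID) else v
            for k, v in doc.items()}

doc = {"id": uuid.UUID("12345678-1234-5678-1234-567812345678"),
       "title_s": "example"}
safe = prepare_doc(doc)  # safe["id"] is now a plain string
```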
While recreating it with a fresh schema, I realized that this was a case of
a very, very stupid user error during configuring the cores.
I setup the testing cores with the wrong configset, and then proceeded to
edit the schema in the right configset. So, the field was actually stored
by default,
What is the message in the log when it crashes?
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 27, 2017, at 10:10 AM, Vijay Kokatnur wrote:
>
> We recently upgraded 4.5 index to 6.5 using IndexUpgrader. The index
We recently upgraded a 4.5 index to 6.5 using IndexUpgrader. The index size
is around 600 GB on disk. When we try to split it using SPLITSHARD, it
creates two new sub-shards on the node and eventually crashes before
completing the split. After restart, the original shard size is around 100
GB and
Design backwards from the search result pages (SRP). Make flat schema(s) with
the fields you will search and display.
One example is the schema I used at Netflix. I used one collection to hold
movies, people (actors), and genres. There were collisions between the integer
IDs, movies IDs were
Does it make sense to use nested documents here? Products could be nested in a
supplier document perhaps.
Alternately, consider de-normalizing "til it hurts". A product doc might be
able to contain supplier info.
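Denormalizing "til it hurts" can be sketched as copying the supplier's fields onto each of its product documents at index time (the field names here are invented for illustration):

```python
def denormalize(product, supplier):
    """Flatten a product and its supplier into one Solr document."""
    doc = dict(product)
    # Prefix supplier fields so they cannot collide with product fields.
    doc.update({"supplier_" + k: v for k, v in supplier.items()})
    return doc

supplier = {"id": "S1", "name": "Acme", "country": "SG"}
product = {"id": "P42", "name": "Widget", "supplier_id": "S1"}
doc = denormalize(product, supplier)
```

The trade-off, as noted above, is that a supplier update means reindexing all of that supplier's product documents.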
On April 27, 2017 8:50:59 AM EDT, Shawn Heisey wrote:
>On
Right Alessandro, that's another bug.
Cheers
2017-04-27 12:30 GMT+02:00 alessandro.benedetti :
> +1
> I would add that what is called "Avg. Document Size (KB)" seems to me
> more like "Avg. Field Size (KB)".
> Cheers
>
>
>
> -
> ---
> Alessandro Benedetti
>
I'm sending commit=true with every update while testing. I'll write up the
tests and see if someone else can reproduce it.
On Thu, Apr 27, 2017 at 10:54 AM Erick Erickson
wrote:
> bq: but is there any possibility that the values stick around until
> there is a segment
bq: but is there any possibility that the values stick around until
there is a segment merge for some strange reason
There better not be, or it's a bug. Things will stick around until
you issue a commit; is there any chance that's the problem?
If you can document the exact steps, maybe we can
Hello
I am trying to update a large number of documents (mostly ADD/DELETE) through
various threads.
After a certain amount of time (a few hours) all my threads get stuck at
taskExecutor-46" prio=5 tid=0x268 nid=0x10c BLOCKED owned by
taskExecutor-9 Id=230 - stats: cpu=2788
Nope ... huge file system (600 GB) only 50% full, and a complete index would
be 80 GB max.
On Wed, Apr 26, 2017 at 4:04 PM, Erick Erickson
wrote:
> Disk space issue? Lucene requires at least as much free disk space as
> your index size. Note that the disk full issue will
Yeah, something's not quite right somewhere. We never even considered
in-place updates an option since it requires the fields to be non-indexed
and non-stored. Our schemas never have any field that satisfies those two
conditions let alone the other necessary ones.
I went ahead and tested the
It does work with "solr.LatLonPointSpatialField" instead of
"solr.LatLonType".
But why not with "solr.LatLonType"?
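For anyone hitting the same issue, a schema.xml sketch of the working point-based type (field names are placeholders); unlike solr.LatLonType, solr.LatLonPointSpatialField keeps docValues, which the sort/boost-by-distance code paths can use instead of the FieldCache:

```xml
<!-- schema.xml: point-based spatial type; docValues enables sorting/boosting -->
<fieldType name="location" class="solr.LatLonPointSpatialField" docValues="true"/>
<field name="latitudeLongitude" type="location" indexed="true" stored="true"/>
```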
--
View this message in context:
On 4/27/2017 3:03 AM, Elodie Sannier wrote:
> We have migrated from Solr 5.4.1 to Solr 6.4.0 on Amazon EC2 and we have
> a high CPU SYS usage and it drastically decreases the Solr performance.
>
> The JVM version (java-1.8.0-openjdk-1.8.0.131-0.b11.el6_9.x86_64), the
> Jetty version (9.3.14) and
On 4/26/2017 11:57 PM, Derek Poh wrote:
> There are some common fields between them.
> At the source data end (database), the supplier info and product info
> are updated separately. In this regard, I should separate them?
> If it's in 1 single collection, when there are updates to only the
>
Hi,
when doing a query with spatial search I get the error: can not use
FieldCache on a field which is neither indexed nor has doc values:
latitudeLongitude_0_coordinate
*SOLR Version:* 6.1.0
*schema.xml:*
*Query:*
27 April 2017, Apache Solr™ 6.5.1 available
The Lucene PMC is pleased to announce the release of Apache Solr 6.5.1
Solr is the popular, blazing fast, open source NoSQL search platform from
the Apache Lucene project. Its major features include powerful full-text
search, hit highlighting,
In addition to what Erick mentioned, you can use JSON faceting and sort
your facets according to your preferences using the stats integration [1].
Cheers
[1] https://cwiki.apache.org/confluence/display/solr/Faceted+Search
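As a sketch, a JSON Facet request sorting category buckets by an aggregated stat might be built like this (price_f and category_s are placeholder field names):

```python
import json

# json.facet request body: buckets of category_s sorted by average price
# rather than by bucket count.
facet = {
    "categories": {
        "type": "terms",
        "field": "category_s",
        "sort": "avg_price desc",
        "facet": {"avg_price": "avg(price_f)"},
    }
}
payload = json.dumps({"query": "*:*", "facet": facet})
```

The payload would be POSTed to the collection's /query endpoint.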
-
---
Alessandro Benedetti
Search Consultant, R
I think the closest you get out of the box is the term vector component[1] .
Cheers
[1]
https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component
-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
View this
+1
I would add that what is called "Avg. Document Size (KB)" seems to me more
like "Avg. Field Size (KB)".
Cheers
-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
View this message in context:
It seems to me that the estimation in MB is in fact an estimation in GB:
the formula includes the avg doc size, which is in KB, so the result is in
KB and should be divided by 1024 to obtain the result in MB.
But it's divided by 1024*1024.
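The unit arithmetic can be checked with a few illustrative numbers:

```python
avg_doc_kb = 5            # illustrative average document size, in KB
num_docs = 1_000_000

total_kb = avg_doc_kb * num_docs
size_mb = total_kb / 1024            # KB -> MB: divide by 1024 once
size_gb = total_kb / (1024 * 1024)   # dividing by 1024*1024 yields GB
```

So 5 KB * 1M docs is about 4883 MB; dividing by 1024*1024 instead produces about 4.77, which is the figure in GB, not MB.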
Hello,
We have migrated from Solr 5.4.1 to Solr 6.4.0 on Amazon EC2 and we have
a high CPU SYS usage and it drastically decreases the Solr performance.
The JVM version (java-1.8.0-openjdk-1.8.0.131-0.b11.el6_9.x86_64), the
Jetty version (9.3.14) and the OS version (CentOS 6.9) have not changed
I think creating a poll for ES people with the question: "How do you run
master nodes? A) on some data nodes B) dedicated node C) dedicated server"
would give some insight into how big an issue having ZK is, and whether
hiding ZK behind Solr would do any good.
Emir
On 25.04.2017 23:13, Otis Gospodnetić wrote:
Hi