I do notice you are using hll (HyperLogLog), which is a distributed
cardinality *estimate*: https://en.wikipedia.org/wiki/HyperLogLog
-Yonik
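To make the estimate-versus-count distinction concrete, here is a minimal sketch of a JSON Facet request body that asks for both aggregations side by side. The field name `user_id` and the helper name `cardinality_request` are hypothetical and not taken from the thread. `hll(field)` and `unique(field)` are the JSON Facet aggregation functions as I understand them; since `unique` may itself be approximate for large cardinalities, treat the comparison as a sanity check rather than ground truth.

```python
import json


def cardinality_request(field, query="*:*"):
    """Build a JSON Facet request body that computes hll() and unique()
    on the same field so the two numbers can be compared."""
    return {
        "query": query,
        "limit": 0,  # no documents needed, only the facet values
        "facet": {
            "approx_distinct": f"hll({field})",   # HyperLogLog estimate
            "distinct": f"unique({field})",       # unique-value count
        },
    }


if __name__ == "__main__":
    # "user_id" is a placeholder field name for illustration only.
    print(json.dumps(cardinality_request("user_id"), indent=2))
```

The resulting body would be POSTed to the collection's query handler; a large gap between the two values points at the estimate rather than at the data.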
On Fri, Nov 10, 2017 at 11:32 AM, kenny wrote:
> Hi all,
>
> We are doing some tests in solr 6.6 with json facet api and we get
> completely wrong count
Some of these questions should be directed to Hortonworks, but I'm glad you
posted them here because I noticed you asked similar questions on the IRC
channel but left before I could jump in and help. Full disclosure, I work
for Lucidworks and one of my jobs is managing the development team that
mak
I'm looking through the slides from 2016 as well as the presentation
from 2017, and in them there is a user interface for this project that I
don't see as being available, so I'm assuming it was created as a different
project. It would be nice to have access to that. Also, all of the examples
in t
Kenny,
This is a known behavior in multi-sharded collections, where the field values
belonging to the same facet don't reside on the same shard. Yonik Seeley has
improved the JSON Facet feature by introducing the "overrequest" and "refine"
parameters.
Kindly check out the Jira:
https://issues.apache.org/jira/brow
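As a rough illustration of the "overrequest" and "refine" parameters mentioned above, the following sketch builds a terms facet that asks each shard for extra buckets and then requests a refinement pass. The field name `category`, the helper name, and the numeric values are hypothetical. Whether `refine` is honored depends on the Solr version in use, so check that against your release.

```python
import json


def terms_facet(field, limit=10, overrequest=None, refine=False):
    """Build a JSON Facet terms facet, optionally with overrequest/refine."""
    facet = {"type": "terms", "field": field, "limit": limit}
    if overrequest is not None:
        # Ask every shard for this many buckets beyond `limit`.
        facet["overrequest"] = overrequest
    if refine:
        # Request a second pass so top buckets get counts from all shards.
        facet["refine"] = True
    return facet


if __name__ == "__main__":
    body = {
        "query": "*:*",
        "limit": 0,
        "facet": {"by_category": terms_facet("category", 10, 50, True)},
    }
    print(json.dumps(body, indent=2))
```

Overrequesting alone only reduces the chance of a missed bucket; refinement is what corrects the counts of the buckets that are finally returned.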
Hi all,
We are doing some tests in Solr 6.6 with the JSON Facet API and we get
completely wrong counts for some combinations of facets.
Setting: We have a set of fields for 376k documents in our query (total
120M documents). We work with 2 shards. When first faceting over
the first facet
Hi,
We have an HDP product cluster and are now planning to build a search
solution for some of our business requirements. In this regard, I have the
following questions. Can you please answer the below questions with respect
to Solr?
- As I understand, it is more performant to have SolrCloud se
This approach could work only if it is an append-only index. If you have
updates/deletes, you have to process them in order; otherwise you will get
incorrect results. I think that is one of the reasons why it might not be
supported: it would not be very useful.
Emir
--
Monitoring - Log Management -
Thanks Amrit,
I get it now. Can you please tell me how I can achieve this using
composite routing, given my requirement below?
I need to send a particular client's data to a particular shard.
Regards,
-Original Message-
From: Amrit Sarkar [mailto:sarkaramr...@gmail.c
Spellcheck works perfectly when I misspell a word, but if the word
already exists in the dictionary, Solr still returns suggestions for
it, e.g. "bike" gets spell-corrected to "bake".
Unfortunately, I cannot use the *maxResultsForSuggest* parameter, as I need to
return the correct spelling irres
Promotion failure usually means your old gen doesn't have enough space to
accommodate the incoming objects (promoted from young to old). In your case,
you have specified NewRatio as 3, which means you have approximately
30*(3/4) = 22.5 GB of old gen heap space. If a heap this big gets fragmented,
it w
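The NewRatio arithmetic above can be checked with a small sketch. NewRatio is the ratio of old generation to young generation, so the young gen gets 1/(NewRatio+1) of the heap and the old gen gets the rest. The helper name is hypothetical, and the calculation deliberately ignores any explicit young-gen sizing flags, so treat the numbers as approximate.

```python
def generation_sizes(heap_gb, new_ratio):
    """Return (young_gb, old_gb) for a heap split by -XX:NewRatio.

    NewRatio = old / young, so young = heap / (NewRatio + 1).
    """
    young = heap_gb / (new_ratio + 1)
    old = heap_gb - young
    return young, old


if __name__ == "__main__":
    print(generation_sizes(30, 3))  # (7.5, 22.5)
    print(generation_sizes(30, 2))  # (10.0, 20.0)
```

With a 30 GB heap, lowering NewRatio from 3 to 2 grows the young gen from 7.5 GB to 10 GB and shrinks the old gen from 22.5 GB to 20 GB, which leaves less room for promoted objects.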
"In case you decide to use an entire new index for the
autosuggestion, you can potentially manage that on your own"
This refers to the fact that it is possible to define an index just for
autocompletion.
You can model the document as you prefer in this additional index, defining
the field types tha
I have a 3-server (Debian) ensemble using ZooKeeper and Solr 6.6.0 on the AWS
cloud. The setup has 3 shards per server with a replication factor of 3. It has
around 11 collections, 2 of which are large, with over 5 million records
each. Since I was maxing out the RAM, I tried to launch the servers with a
hig
Ketan,
> here I have also created new field 'core' which value is any shard where I
> need to send documents and on retrieval use '_route_' parameter with
> mentioning the particular shard. But issue facing still my
> clusterstate.json showing the "router":{"name":"compositeId"} is it means
> my se
Hi Erik,
My requirement is to index the documents of a particular organization to a
specific shard. Also, I have made changes in core.properties as mentioned below.
Model Collection:
name=model
shard=shard1
collection=model
router.name=implicit
router.field=core
shards=shard1,shard2
Workset Collection:
n
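For orientation, here is a minimal sketch of what indexing and querying look like when a collection uses the implicit router with `router.field=core`, as in the properties listed above. It assumes the collection really was created with that router, which the thread does not confirm. The helper names and the extra `org` field are hypothetical. `_route_` is the request parameter used to target a shard.

```python
def doc_for_shard(doc_id, shard, **fields):
    """Build a document whose 'core' field names the target shard.

    With router.field=core, the value of 'core' selects the shard.
    """
    doc = {"id": doc_id, "core": shard}
    doc.update(fields)
    return doc


def query_params_for_shard(q, shard):
    """Build query parameters that restrict the search to one shard."""
    return {"q": q, "_route_": shard}


if __name__ == "__main__":
    # "org" and its value are placeholder data for illustration only.
    print(doc_for_shard("doc-1", "shard1", org="acme"))
    print(query_params_for_shard("org:acme", "shard1"))
```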
Hi,
Can we index the Message-ID that is from the EML file into Solr?
Tika does have the Message-ID in the MIME data, but is Solr able to read it
and index it?
I'm using Solr 6.5.1.
Regards,
Edwin
Hi Erick,
Please ignore my earlier mail. I got it working! I had missed the rule
attribute.
Now it is working.
Thanks,
Karan
On 10 November 2017 at 15:59, Karan Saini wrote:
> Hi Erick,
>
> Thanks for the help. It is working fine with the *KeywordTokenizerFactory.
> *Like you mentioned, i wan
Hi Erick,
Thanks for the help. It is working fine with the
*KeywordTokenizerFactory*. Like
you mentioned, I also want to search for "dog" or "*dog*" alone.
Case sensitivity is working fine, but I want wildcard-based search
as well.
So I tried this changed code, but no luck!
Thank you for your suggestion!
About the NewRatio param: I found some 'promotion failed' entries in the GC log.
They trigger a stop-the-world GC instead of a CMS GC. If I change NewRatio to 2,
the promotion failures may appear more frequently. Is the 'promotion failed'
caused by some inappropriate use of SolrCloud
Hi,
There are a couple of things based on your configuration I can suggest.
-XX:ConcGCThreads=4 - try removing this restriction on the threads.
-Xms30g - you should reconsider this param, as 30 GB is a huge heap size.
Instead, in SolrCloud, try spawning multiple instances if you have system
resou
Hi Erick,
Some of the partial updates are taking a huge amount of time. The average QTime
for updates over a 15-minute interval is 14344.
2017-11-10 08:15:11.863 INFO (qtp225493257-43961) [ x:collection]
o.a.s.c.S.Request [collection] webapp=/solr path=/update
params={wt=javabin&version=2} status=0 QTime=1007390
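A small sketch for pulling QTime values out of request log lines like the one above and averaging them. QTime is reported in milliseconds. The function names are hypothetical, and the regular expression assumes the `QTime=<digits>` form shown in the log line.

```python
import re

# Matches the "QTime=<digits>" token in a Solr request log line.
QTIME_RE = re.compile(r"\bQTime=(\d+)")


def extract_qtimes(lines):
    """Return the QTime (milliseconds) of every line that has one."""
    qtimes = []
    for line in lines:
        match = QTIME_RE.search(line)
        if match:
            qtimes.append(int(match.group(1)))
    return qtimes


def average_qtime(lines):
    """Return the mean QTime in milliseconds, or None if none was found."""
    qtimes = extract_qtimes(lines)
    return sum(qtimes) / len(qtimes) if qtimes else None


if __name__ == "__main__":
    sample = [
        "o.a.s.c.S.Request [collection] webapp=/solr path=/update "
        "params={wt=javabin&version=2} status=0 QTime=1007390",
    ]
    print(extract_qtimes(sample))  # [1007390]
```

The single request shown took 1007390 ms, roughly 17 minutes, so a few such outliers can dominate a 15-minute average.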
Hello,
as Amrit mentioned, I attached the schema.xml of such an index. Perhaps there
is something to find in it.
The responses of update and commit look quite normal:
{responseHeader={status=0,QTime=2}}
{responseHeader={status=0,QTime=14}}
After committing the fields
fld_BA1F56CD9C87419CB9A271D
21 matches