Hey everybody,
I have a question regarding query-request logging in SolrCloud. I've set the
"org.apache.solr.core.SolrCore.Request" logger to INFO level and it's
logging all those query requests. So far so good. BUT, as I'm running Solr in
cloud mode with 3 nodes and 3 shards per collection
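For reference, this is roughly how that logger level can be set in Solr's log4j.properties (assuming a log4j 1.x setup; newer releases use log4j2.xml instead, and the level can also be changed at runtime from the Logging tab in the admin UI):

log4j.logger.org.apache.solr.core.SolrCore.Request=INFO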
On 12/4/2017 12:11 PM, Steve Pruitt wrote:
Getting my Solr Cloud nodes up and running took manually setting execution
permissions on the configuration files and manually creating the logs and
logs/archived folders under /opt/solr/server. Even though I have my log
folders set to var/solr/logs
On 12/3/2017 9:27 AM, Mahmoud Almokadem wrote:
We're facing an issue related to the dataimporter status on new Admin UI
(7.0.1).
Calling the API
http://solrip/solr/collection/dataimport?_=1512314812090&command=status&indent=on&wt=json
returns a different status even though the importer is running
One more opinion on source field vs separate collections for multiple corpora.
Index statistics don’t really settle down until at least 100k documents. Below
that, idf is pretty noisy. With Ultraseek, we used pre-calculated frequency
data for collections under 10k docs.
If your corpora have sim
Thanks Erick. I have already followed the SolrJ indexing article very closely - I have
to do a lot of manipulation at indexing time. The other blog article is very
interesting, as I do indeed use "year" (year of publication) and it is very
frequently used to filter queries. I will have a play with that no
That's the unpleasant part of semi-structured documents (PDF, Word,
whatever). You never know the relationship between raw size and
indexable text.
Basically, anything that doesn't need to contribute to _scoring_ is
often better in an fq clause. You can also use {!cache=false} to
bypass actually u
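A rough sketch of what that looks like in a request (the field names here are made up for illustration):

q=title:report
&fq=year:2017
&fq={!cache=false}source:corpusA

The first fq is cached in the filterCache and reused across queries; the second is evaluated but never stored in the cache.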
Hi,
I wanted to check whether it is a known issue that the merge metrics are not
exposed as JMX beans.
Has anyone else in the community run into this issue?
Thanks
Suresh
On Sun, Dec 3, 2017 at 4:24 PM, suresh pendap wrote:
> I see only these metrics in my Jconsole window
>
> [image: Inline image 1]
>
>You'll have a few economies of scale I think with a single core, but frankly I
>don't know if they'd be enough to measure. You say the docs are "quite large"
>though, are you talking books? Magazine articles? Is 20K large, or are they 20MB?
Technical reports. Sometimes up to 200MB PDFs, but that
At that scale, whatever you find administratively most convenient.
You'll have a few economies of scale I think with a single core, but
frankly I don't know if they'd be enough to measure. You say the docs
are "quite large" though, are you talking books? Magazine articles? is
20K large or are the 2
I have two different document stores that I want to index. Both are quite small
(<50,000 documents, though the documents can be quite large). They are quite capable
of using the same schema, but you would not want to search both simultaneously.
I can see two approaches to handling this case.
1/ Create a
Not sure how it's possible. But I also tried using the _default config and just
adding in the source and target configuration to make sure I didn't have
something wonky in my custom solrconfig that was causing this issue. I can
confirm that until I restart the follower nodes, they will not recei
Have you tried HTMLStripCharFilterFactory?
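For example, a field type along these lines (a sketch, not a drop-in schema) strips HTML tags before tokenizing:

<fieldType name="text_html" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>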
On Mon, Dec 4, 2017 at 12:37 PM, Fiz Newyorker wrote:
> Hello Solr Group,
>
> Good Morning !
>
> I am working on Solr 6.5 version and I am trying to Index from Mongo DB
> 3.2.5.
>
> I have content collection in mongodb where there is body column which h
Hello Solr Group,
Good Morning !
I am working on Solr 6.5 version and I am trying to Index from Mongo DB
3.2.5.
I have a content collection in MongoDB where there is a body column which has
HTML tags in it.
I want to index the body column without the HTML tags.
*Please see the below body column data in mo
On 12/4/2017 1:53 AM, Puppy Linux Distros wrote:
I know it's a bad practice, but for various reasons our application fires
hard commits via code (on most of the /update calls) and invokes the /update API
with commit=true, and the application rarely uses soft commits. I will
recommend devs to look forward
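For context, the usual alternative to client-side commits is to let Solr manage them via autoCommit / autoSoftCommit in solrconfig.xml; a rough sketch (the intervals are just examples):

<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>120000</maxTime>
</autoSoftCommit>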
Thank you Shawn for replying to each item.
I'm starting to get a better grasp of all this tricky JVM stuff.
Dominique
On Sun, Dec 3, 2017 at 01:30, Shawn Heisey wrote:
> On 12/2/2017 8:43 AM, Dominique Bejean wrote:
> > I would like to have some advice on best practices related to Heap Size,
> > MMap,
Getting my Solr Cloud nodes up and running took manually setting execution
permissions on the configuration files and manually creating the logs and
logs/archived folders under /opt/solr/server. Even though I have my log
folders set to var/solr/logs in the default/solr.in.sh file.
After gettin
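For what it's worth, the relevant setting in /etc/default/solr.in.sh looks roughly like this (path shown only as an example):

SOLR_LOGS_DIR=/var/solr/logs

With that set, Solr should write its logs under /var/solr/logs rather than under /opt/solr/server.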
On Mon, Dec 4, 2017 at 1:35 PM, Shawn Heisey wrote:
> I'm pretty sure that the difference between docCount and maxDoc is deleted
> documents.
docCount (not the best name) here is the number of documents with the
field being searched. docFreq (df) is the number of documents
actually containing t
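For context, Lucene's BM25 idf is (as I understand it) computed per field from exactly those two numbers:

idf = log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))

so a sparsely populated field gets a smaller docCount and its terms' idf values shift accordingly, instead of being diluted by the index-wide maxDoc.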
On 12/4/2017 9:54 AM, Steve Pruitt wrote:
I used the -u option to provide the installer with a user id. The /var/solr
folder has the user set as the owner. But, the /opt/solr folder is owned by
root. How did this happen?
When you install the service, Solr has no need to write to the progra
On 12/4/2017 7:21 AM, alessandro.benedetti wrote:
the reason docCount was improving things is because it was using a docCount
relative to a specific field, while maxDoc is global across the whole index?
Lucene/Solr doesn't actually delete documents when you delete them, it
just marks them as delet
On 12/4/2017 7:33 AM, Steve Pruitt wrote:
I edited /etc/default/solr.in.sh to list my ZK hosts and I uncommented
ZK_CLIENT_TIMEOUT leaving the default value of 15000.
The default is 15 seconds; most of the example configs that Solr
includes have it increased to 30 seconds. IMHO, 15 seconds i
I think the below article explains it well:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
I was thinking that docValues need to be brought into the JVM from the OS
cache.
Turns out that is not required, as the docValues are loaded into the virtual
address space by the OS
Hi All - this same problem happened again, and I think I partially
understand what is going on. The part I don't know is what caused any
of the replicas to go into full recovery in the first place, but once
they do, they cause network interfaces on servers to become fully utilized
in both in/out d
I used the -u option to provide the installer with a user id. The /var/solr
folder has the user set as the owner. But, the /opt/solr folder is owned by
root. How did this happen?
I checked the opt/solr/bin/init.d/solr and verified RUNAS is set to the user I
entered. When I try to execute se
Neither commit does anything if no updates have been received.
But you don't need to wait for the devs to STOP DOING THAT ;). In
solrconfig.xml you can set:
IgnoreCommitOptimizeUpdateProcessorFactory
see the ref guide
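A rough sketch of that chain in solrconfig.xml (check the ref guide for the full set of options; statusCode=200 makes the ignored commit return success silently to the client):

<updateRequestProcessorChain name="ignore-commit-from-client" default="true">
  <processor class="solr.IgnoreCommitOptimizeUpdateProcessorFactory">
    <int name="statusCode">200</int>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>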
Best,
Erick
On Mon, Dec 4, 2017 at 12:53 AM, Puppy Linux Distros wrote:
>
The documentation states you cannot run SolrCloud as root. When I installed
Solr I gave it another user. I checked the init.d script and RUNAS is set to
the user I entered. This user doesn't have the permissions I need, but I am
not exactly sure where to check permissions.
Thanks.
-S
I would like to stress how important what Erick explained is.
A lot of times people want to use the score to show it to the
users, calculate a probability, or do other odd calculations.
The score is used to rank results, given a query,
to give a local ordering.
This is the only useful information for the end
Will do thx
On Dec 4, 2017 at 9:27 PM, "Emir Arnautović" <emir.arnauto...@sematext.com> wrote:
> Hi Faraz,
> When you say query without sort, I assume that you mean you omit sort, so
> you expect it to be sorted by score. It is expected to be slower than an equivalent
> query that does not calculate the score -
Furthermore, taking a look at the code for the BM25 similarity, it seems to me it
is currently working correctly:
- docCount is used per field if != -1
/**
 * Computes a score factor for a simple term and returns an explanation
 * for that score factor.
 *
 *
 * The default implementation us
Thanks.
I edited /etc/default/solr.in.sh to list my ZK hosts and I uncommented
ZK_CLIENT_TIMEOUT leaving the default value of 15000.
I am not sure if I need to set the SOLR_HOST. This is not a production
install, but I am running with three ZK machines and three Solr machines in the
cluster.
T
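For reference, the solr.in.sh entries in question look roughly like this (hostnames and values are placeholders):

ZK_HOST="zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181"
ZK_CLIENT_TIMEOUT="30000"
SOLR_HOST="solr1.example.com"

SOLR_HOST mainly matters so that each node registers itself in ZooKeeper under a resolvable hostname instead of a default or loopback address.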
Hi Faraz,
When you say query without sort, I assume that you mean you omit sort, so you
expect it to be sorted by score. It is expected to be slower than an equivalent query
that does not calculate the score at all - e.g. running the same query as an fq.
What you observe can be explained by:
* Solr is calculating score even not
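As a rough illustration of the "same query as an fq" comparison (field names are invented here):

Scored, sorted by relevance:
  q=title:(foo OR bar OR baz)
No scoring needed, sorted by a field:
  q=*:*&fq=title:(foo OR bar OR baz)&sort=pubdate+desc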
Hi Markus,
Just out of interest, why did
"It was solved back then by using docCount instead of maxDoc when
calculating idf, it worked really well!" solve the problem?
I assume you are using different fields, one per language.
Each field appears in a different number of docs, I guess.
e.g.
t
Hi guys,
Sorry to bother you again, but I am really confused:
I've used the Solr admin UI and created a query with lots of ORs using Solr
4.7.
When I execute the query without a sort, it executes in roughly 3.5 - 4
seconds.
When I execute it with a sort on a field called pubdate, it takes abou
Thanks. It's very clear not to go with Java 9.
On 2 December 2017 at 00:37, Shawn Heisey wrote:
> On 12/1/2017 12:32 PM, marotosg wrote:
> > Would you recommend installing Solr 6.6.1 with Java 9 for a production
> > environment?
>
> Solr 7.x has been tested with Java 9 and should work with no proble
Facing errors when using grouping and facets.
These queries work fine -
http://localhost:8983/solr/urls/select?q=*:*&rows=0&facet=true&facet.field=city_id
http://localhost:8983/solr/urls/select?q=*:*&rows=0&facet=true&facet.field=city_id&group=true&group.field=locality_id
These kinds of queries (where
Hi,
Thanks Shawn for the help.
I think I should have added a few more details to my previous mail.
I know it's a bad practice, but for various reasons our application fires
hard commits via code (on most of the /update calls) and invokes the /update API
with commit=true, and the application rarely uses s