Hi,
I am considering SolrCloud for our applications but I have run into the
limitation of not being able to use Join Queries in distributed searches.
Our requirements are the following:
- SolrCloud will serve many applications where each application index
is separate from other application.
Aha! mlt=true, that was the key I hadn't worked out before (thought it was
qt=mlt that achieved that), things are looking rosy now, and these results
are a perfect fit for my needs. Thanks very much for your time to help
explain this!!
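For anyone following along, here is a minimal sketch of the distinction described above: enabling MoreLikeThis with mlt=true on an ordinary select request, rather than routing the request with qt=mlt. Only mlt=true comes from this thread. The base URL, the query, and the field names are assumptions for illustration, and build_mlt_url is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, query, similarity_fields, count=5):
    """Build a /select URL that turns on the MoreLikeThis component."""
    params = [
        ("q", query),
        ("mlt", "true"),                          # the switch discussed above
        ("mlt.fl", ",".join(similarity_fields)),  # fields used for similarity
        ("mlt.count", str(count)),                # similar docs per result
        ("wt", "json"),
    ]
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Assumed host, query, and field names; adjust to your own schema.
url = build_mlt_url("http://localhost:8983/solr", "id:123", ["title", "body"])
print(url)
```

The sketch only builds the request URL; sending it and reading the moreLikeThis section of the response is left to whatever HTTP client you already use.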
David
-Original Message-
From: Jack Krupansky
On 1/4/13 9:21 AM, Hassan wrote:
Hi,
I am considering SolrCloud for our applications but I have run into
the limitation of not being able to use Join Queries in distributed
searches.
Our requirements are the following:
- SolrCloud will serve many applications where each application
index is
We are starting a new e-com application from this month onwards, for which I
am trying to identify the right SOLR release. We were using 3.4 in our
previous project, but I have read in multiple blogs and forums about the
improvements that SOLR 4 has in terms of efficient memory management, less
As someone in the forum correctly said, if all Solr releases were
evolutionary Solr 4.0 is revolutionary. It has lots of improvement over the
previous releases like NoSql features, atomic updates, cloud features and
lot more.
Solr 4.0 would be the right migration I believe.
Can someone in the
First, I'm assuming SolrCloud with Zookeeper etc.
1 Don't do anything. If Node A is the leader, the replica for that shard
will become the leader.
2 This is a little unclear. There are two cases, a the leader crashed or
b the replica crashed.
a no problem, distributed
3.6.2 is a maintenance release with bug fixes for existing 3.x users for
whom an upgrade to 4.0 is too big a leap at present. 4.0 is the release
that will see active development from here on in. If you are starting
with a new project, 4.0 seems a reasonable place to start. I'd expect
4.1 to be
Solr does not support federated search in the form you describe - that
is, to make a query to Solr which solr defers to another search system.
There may be ways you could achieve it (Solr is pretty extensible) and
such a feature would be a very useful one, but it would take some,
likely
This is a good explanation and makes sense. The one inconsistency is referring
to a replica of a shard that has no replication. But it's not that big of a
problem. If you wove the term 'core' into your writeup below it would be
complete and should be posted on the wiki.
Sent from my Verizon
I thought about adding Solr core, but it only muddies the water. Yes, it
needs to be added, but carefully.
In the context of SolrCloud, a Solr core is the underlying representation of
a replica. Alternatively, a replica of a shard of a collection is
implemented as a Solr core. [Need to factor
Yes. That's it. It's clear if we separate logical terms from physical terms. A
simple cake diagram on the wiki, along with perhaps a UML diagram, will solidify these
concepts.
Sent from my Verizon Wireless 4G LTE Smartphone
Original message
From: Jack Krupansky j...@basetechnology.com
On Fri, Jan 4, 2013 at 2:26 AM, Per Steffensen st...@designware.dk wrote:
Our biggest problem is that we really haven't decided once and for all and
made sure to reflect the decision consistently across code and
documentation. As long as we haven't, I believe it is still OK to change our
minds.
Agreed. But for completeness can it be node/collection/shard/replica/core?
Sent from my Verizon Wireless 4G LTE Smartphone
Original message
From: Yonik Seeley yo...@lucidworks.com
Date:
To: solr-user@lucene.apache.org
Subject: Re: Terminology question: Core vs. Collection
Actually. Node/collection/shard/replica/core/index
Sent from my Verizon Wireless 4G LTE Smartphone
Original message
From: darren dar...@ontrenet.com
Date:
To: yo...@lucidworks.com,solr-user@lucene.apache.org
Subject: Re: Terminology question: Core vs. Collection vs...
Hi,
I am trying to migrate to Solr 4 (from 3.6) for a
multithreaded/multicollection environment using the SolrJ Java client. I need
some clarification of when to use the
CloudSolrServer vs LBHttpSolrServer. Any help is appreciated.
Which one do I use? The CloudSolrServer uses the LB server
CloudSolrServer can be used for indexing and is smart about indexing since it
knows the current cluster state.
For 4.0 I'd use one per collection because there is a bug around this fixed in
the upcoming 4.1 (using one for more than one collection).
In fact, if you are moving to 4, it's a good
Any estimated release date, Mark? I heard something about January. I was
considering using 4.0 for production, but if the 4.1 release is incoming I
could wait a little longer.
2013/1/4 Mark Miller markrmil...@gmail.com
CloudSolrServer can be used for indexing and is smart about indexing since
it
OK, thank you for the answer.
Maybe you could point me to documentation or any other source where I can
get an idea of how to develop such an extension.
Thanks
Oleg.
On Fri, Jan 4, 2013 at 2:47 PM, Upayavira u...@odoko.co.uk wrote:
Solr does not support federated search in the form you describe
On 1/4/2013 8:54 AM, Luis Cappa Banda wrote:
Any estimated release date, Mark? I heard something about January. I was
considering using 4.0 for production, but if the 4.1 release is incoming I
could wait a little longer.
I'm not a committer, but I contribute the occasional patch and keep an
eye on
Thanks for pointing me to Solr's Zookeeper servlet. I will look at the
source to see how I can use it to fulfill my needs.
Bill
On Thu, Jan 3, 2013 at 6:43 PM, Mark Miller markrmil...@gmail.com wrote:
Technically, you want to make sure zookeeper reports the node as live and
active.
You could
Thanks Mark.
-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: Friday, January 04, 2013 9:51 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4 (CloudSolrServer and LBHttpSolrServer question)
CloudSolrServer can be used for indexing and is smart about
I'm going to push *hard* for a Jan release. Woe to those that get in my way :)
- Mark
On Jan 4, 2013, at 11:37 AM, Shawn Heisey s...@elyograg.org wrote:
On 1/4/2013 8:54 AM, Luis Cappa Banda wrote:
Any estimated release date, Mark? I heard something about January. I was
considering using 4.0
Hi Mark,
SOLR-3929 rocks!
A nightly build of 4.1 with maxIndexingThreads set to 24 takes
80% to 100% of the cpu resources :-)
Thank you, Otis and Gora
mpstat 10
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
00 0 13 607 241 234 78 100
Well, I hope this won't spoil everything then:
https://issues.apache.org/jira/browse/SOLR-4260
I'll continue testing on Monday.
-Original message-
From:Mark Miller markrmil...@gmail.com
Sent: Fri 04-Jan-2013 17:54
To: solr-user@lucene.apache.org
Subject: Re: Solr 4 (CloudSolrServer and
Can I just start by saying that this was AMAZING. :-) When I asked the
question, I certainly did not expect this level of detail.
And I vote for the cake diagram on the wiki as well. Perhaps two, with the
first one showing the trivial collapsed state of single
collection/shard/replica/core. The
The entire collection does have an index - a distributed index - which
consists of a Lucene index on each core/replica for the subset of the data
in that shard.
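The terminology in this thread can be sketched as a small data model: a collection has shards, each shard has one or more replicas, each replica is implemented as a Solr core, and each core is currently backed by one Lucene index. This is only an illustration of the vocabulary. All class names, field names, node names, and paths below are made up for the example and are not Solr APIs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LuceneIndex:
    path: str  # hypothetical on-disk location of the index


@dataclass
class Replica:
    core_name: str      # the Solr core that implements this replica
    node: str           # the Solr instance (node) hosting the core
    index: LuceneIndex  # currently one Lucene index per core


@dataclass
class Shard:
    name: str
    replicas: List[Replica] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    shards: List[Shard] = field(default_factory=list)

    def lucene_indexes(self) -> List[LuceneIndex]:
        """The 'distributed index': every Lucene index behind the collection."""
        return [r.index for s in self.shards for r in s.replicas]


# Two shards with two replicas each, spread over two assumed nodes.
demo = Collection("products", [
    Shard(f"shard{s}", [
        Replica(f"products_shard{s}_replica{r}", f"node{r}",
                LuceneIndex(f"/data/products_shard{s}_replica{r}/index"))
        for r in (1, 2)
    ])
    for s in (1, 2)
])
print(len(demo.lucene_indexes()))
```

Keeping the index as a separate object from the replica mirrors the logical/physical split debated here, and leaves room for the one-index-per-core relationship to change later.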
-- Jack Krupansky
-Original Message-
From: Alexandre Rafalovitch
Sent: Friday, January 04, 2013 1:12 PM
To:
My understanding is core is a logical solr term. Index is a physical lucene
term. A solr core is backed by a physical lucene index. One index per core.
Solr team can correct me if it's not accurate. :)
Sent from my Verizon Wireless 4G LTE Smartphone
Original message
From:
Hmm. Doesn't that make (logical) index=collection? And (physical)
index=core? Which creates duplication of terminology and at the same time
can cause confusion between highest logical and lowest physical level.
Regards,
Alex.
P.s. Hoping not to start a new terminology war.
Personal blog:
On Fri, Jan 4, 2013 at 1:35 PM, Alexandre Rafalovitch
arafa...@gmail.com wrote:
Hmm. Doesn't that make (logical) index=collection? And (physical)
index=core? Which creates duplication of terminology and at the same time
can cause confusion between highest logical and lowest physical level.
I agree. In my opinion index is a low level lucene thing. I never say a
collection has an index directly. That confuses levels and creates confusion.
To me at least. I think the terminology discussed is good. Just some lingering
usage inconsistencies.
Sent from my Verizon Wireless 4G LTE
Using your terminology, I'd say core is a physical solr term, and index
is a physical lucene term. A collection or a shard is a logical solr
term.
Upayavira
On Fri, Jan 4, 2013, at 06:28 PM, darren wrote:
My understanding is core is a logical solr term. Index is a physical
lucene term. A solr
Good point. Agree.
Sent from my Verizon Wireless 4G LTE Smartphone
Original message
From: Upayavira u...@odoko.co.uk
Date:
To: solr-user@lucene.apache.org
Subject: Re: Terminology question: Core vs. Collection vs...
Using your terminology, I'd say core is a physical
Currently a SolrCore is 1:1 with a low level Lucene index. There is no reason
that needs to always be that way. It's possible that we may at some point add
built-in micro sharding support that means a SolrCore could have multiple
underlying Lucene indexes. Or we may not.
- Mark
On Jan 4,
Yes. In that case, core should best be described as a logical solr
entity with various managed attributes
and qualities above the physical layer (sorry, not trying to perpetuate
this thread so much).
On 01/04/2013 01:55 PM, Mark Miller wrote:
Currently a SolrCore is 1:1 with a low level
It was a very good explanation, Jack!
I believe I have heard most of it before, so it is really not new for
me. I DO understand that the name replica and replication-factor CAN
be justified, but it requires a long and thorough explanation. And that's
the point. A good name for a concept means
We're not gonna have documentation to explain it. I guess it is more a
question of starting a discussion here about how to do it.
My thought would be to write an adapter in front of your APIs to make it
look like a Solr instance, and fake distributed search. But, to get that
to work, you'd need
Would this be a reasonable (if very rough) attempt at cake diagram?
https://docs.google.com/drawings/d/1XxLjds0OOm44zOVCMR-cwCJXnTs3C2x257KpCTxI1Ec/edit
Not sure if I managed to get logical/physical separation clearly enough,
but it could be a start.
Regards,
Alex.
Personal blog:
On Jan 4, 2013, at 2:14 PM, Per Steffensen st...@designware.dk wrote:
I'm not sure what the node tells Zookeeper and who does shard assignment. I
mean, does a node explicitly say what shard it wants to be, or is that
assigned by Zookeeper, or is that a node's choice/option?
It's basically
Hi,
If you don't need to shard your index and don't need NRT search Solr 3.x is
much simpler to operate and is more mature.
Otis
Solr ElasticSearch Support
http://sematext.com/
On Jan 4, 2013 7:08 AM, Dikchant Sahi contacts...@gmail.com wrote:
As someone in the forum correctly said, if all
Yes, it would be great to start a discussion of this topic.
I am looking for some kick-start information to begin a more detailed
investigation. And of course, maybe someone has already faced this problem,
so please share your ideas and experience.
Thanks
Oleg.
On Fri, Jan 4, 2013 at 2:15 PM,
I agree with the 'more mature' analysis, but surely you can use 4.0 in a
3.x style without greater difficulty, no?
Upayavira
On Fri, Jan 4, 2013, at 07:35 PM, Otis Gospodnetic wrote:
Hi,
If you don't need to shard your index and don't need NRT search Solr 3.x
is
much simpler to operate and
Sounds like you may have a corrupt index. Try running the CheckIndex tool.
Otis
Solr ElasticSearch Support
http://sematext.com/
On Jan 3, 2013 8:59 AM, Karan jindal karanjindal1...@gmail.com wrote:
Hi everyone,
I have a solr index which is built using solr 3.2.
I am facing two problems with
I think the problem is that you have to interpret the user query (Solr has
one syntax, other sources have a different one) and then combine results
(how?). All of those are non-trivial.
Have you looked at something like
http://www.comcepta.com/en/enterprise-metasearch.html which builds on top
of
Sachin,
You might get more responses on this list if you can describe in a little more
detail what your application needs to do. A lot of us haven't used Endeca and won't
understand exactly what you mean here.
With that said, I migrated a few apps from Endeca to Solr a few years back and
will try to
On Jan 4, 2013, at 3:41 PM, Dyer, James james.d...@ingramcontent.com wrote:
4. Dynamic Business Rules.
There is an open JIRA issue around biz rules and drools integration. Not sure
if there is any work done there, but at least some notes about it last I looked.
- Mark
Hello Solr-Users,
I thought you, or someone you know, might be interested in a very important
role here at Simply Hired. The Staff Search Engineer will own the
responsibility of writing the search engine of SimplyHired. You will work
on cutting edge machine learning, search and big data
Hi All,
I am getting exceptions on trying to create a collection. Any help is
appreciated.
While trying to create a collection, I got this error
Caused by: org.apache.solr.client.solrj.SolrServerException: No live
SolrServers available to handle this request
at
Hi,
I think what you are seeing is a general thing. Regular search is slower
while there is indexing, too, of course.
So maybe it's best to mentally decouple indexing part here and simply make
your calls as fast as possible without indexing. Then you can add indexing
and play with things like
For the second one:
Wrong version of library on a classpath or multiple versions of library on
the classpath which causes wrong classes with missing fields/variables? Or
library interface baked in and the implementation is newer. Some sort of
mismatch basically. Most probably in Apache http
Thanks. I guess you're right - it's normal behaviour. Are there any
guidelines on how to set ramBufferSizeMB, or is testing the only way? Do you
know if DIH is gentler than indexing via the REST or SolrJ API?
Kind regards.
On 4 January 2013 23:14, Otis Gospodnetic otis.gospodne...@gmail.com wrote:
Hi,
I
Thanks! I had a different version of httpclient in the classpath. So the 2nd
exception is gone but now I am back to the first one
org.apache.solr.client.solrj.SolrServerException: No live SolrServers
available to handle this request
-Original Message-
From: Alexandre Rafalovitch
DIH won't make any real difference, I'd say. The work to write terms to
your index still happens in either case.
Upayavira
On Fri, Jan 4, 2013, at 11:25 PM, Marcin Rzewucki wrote:
Thanks. I guess you're right - it's normal behaviour. Are there some
guidelines how to use ramBufferSizeMB or only
Have you tried Wireshark yet to see what host/port it is trying to connect to
and why it fails? It is a complex tool, but well worth learning.
Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps
That's probably as official as anything ever gets around here.
-- Jack Krupansky
-Original Message-
From: Mark Miller
Sent: Friday, January 04, 2013 11:47 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4 (CloudSolrServer and LBHttpSolrServer question)
I'm going to push *hard*
If you index from the outside (i.e. not using DIH) you have more control:
* how many threads you use
* how you batch documents
* how much you wait between indexing batches
...
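A rough sketch of two of the knobs listed above, batching and waiting between batches, when indexing from your own code. The send_batch callable is a stand-in for whatever client call you use to post documents to Solr; it, the function names, and the default values are assumptions for illustration. Controlling the number of threads is left out to keep the example short.

```python
import time
from itertools import islice


def batched(docs, batch_size):
    """Yield successive lists of at most batch_size documents."""
    it = iter(docs)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def index_in_batches(docs, send_batch, batch_size=100, pause_seconds=0.0):
    """Send docs in batches, optionally sleeping after each one.

    Returns the number of batches sent.
    """
    sent = 0
    for batch in batched(docs, batch_size):
        send_batch(batch)              # hypothetical: post the batch to Solr
        sent += 1
        if pause_seconds:
            time.sleep(pause_seconds)  # throttle so searches get some room
    return sent


# Usage with a stand-in sender that just records what it was given.
received = []
count = index_in_batches(range(250), received.append, batch_size=100)
print(count, [len(b) for b in received])
```

Batch size and pause length are exactly the sort of values that have to be found by testing against your own hardware and query load.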
Otis
--
Solr ElasticSearch Support
http://sematext.com/
On Fri, Jan 4, 2013 at 6:25 PM, Marcin Rzewucki
Not at this time. That is something you would do at your app level -
re-query with a looser query if zero results for the original query.
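A minimal sketch of that app-level approach, assuming the application already knows how to produce progressively looser versions of a query. The search callable stands in for a real Solr request, and the function name, the sample queries, and the fake index are all made up for illustration.

```python
def search_with_fallback(search, queries):
    """Try each query in order and return (query, results) for the first hit.

    queries should be ordered from strictest to loosest.
    Returns (None, []) when every query comes back empty.
    """
    for query in queries:
        results = search(query)
        if results:
            return query, results
    return None, []


# Stand-in for a real search call; only one query has any matches.
FAKE_INDEX = {"red shoes": ["doc1", "doc7"]}


def fake_search(query):
    return FAKE_INDEX.get(query, [])


used, hits = search_with_fallback(
    fake_search,
    ["red leather shoes size 9", "red leather shoes", "red shoes"],
)
print(used, hits)
```

How the looser queries are generated (dropping terms, lowering a minimum-match setting, and so on) is an application decision that this sketch does not try to make.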
-- Jack Krupansky
-Original Message-
From: Varun Thacker
Sent: Friday, January 04, 2013 7:50 AM
To: solr-user@lucene.apache.org
Subject: Removing
Hi Varun,
I don't think this exists in Solr...
But have a look at http://sematext.com/products/dym-researcher/index.html .
Look at the screenshot and you will spot something labeled as Relaxer in
the blue area. This (Query) Relaxer is DYM ReSearcher's cousin and can be
seen in action on
Hi,
I think things will work for Hassan as he described them. The key is not
to shard in his case, that's all.
Hassan, yes, 1-2M docs is small. But beware of creating a crazy
number (e.g. thousands) of collections per server, as each collection has
some cost.
Otis
--
Solr ElasticSearch
Hi Erick,
The issue was with ZooKeeper: when we tried to force a full replication by
cleaning the data dir in ZooKeeper, it caused the index removal.
Our index always replicated in full, even after a short outage or restart. I
think being too far out of date could be the reason. We felt ZooKeeper was to
blame here.