Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!

2021-02-19 Thread Atita Arora
Congratulations Jan!

On Fri, Feb 19, 2021 at 9:41 AM Dawid Weiss  wrote:

> Congratulations, Jan!
>
> On Thu, Feb 18, 2021 at 7:56 PM Anshum Gupta 
> wrote:
> >
> > Hi everyone,
> >
> > I’d like to inform everyone that the newly formed Apache Solr PMC
> nominated and elected Jan Høydahl for the position of the Solr PMC Chair
> and Vice President. This decision was approved by the board in its February
> 2021 meeting.
> >
> > Congratulations Jan!
> >
> > --
> > Anshum Gupta
>


Re: Is the lucene.apache.org link dead?

2021-02-01 Thread Atita Arora
True, the link has been down since last week. I checked, as we are
currently migrating to 8.7 too.


On Mon, Feb 1, 2021 at 6:57 AM Taisuke Miyazaki 
wrote:

> Hi,
> I tried to open the Solr News page to check the contents of the Solr
> release, but it returns Not Found.
> I think either the link is wrong or the page is broken.
> If there is a problem, do you think you can fix it?
>
> Sorry if this has already been discussed somewhere.
>
> Solr News Page: https://lucene.apache.org/solr/news.html
> Dead Link: https://lucene.apache.org/solr/8_7_0/changes/Changes.html
>
> Thank you.
> Taisuke.
>


Re: Config files not replicating

2020-06-30 Thread Atita Arora
Yes, the config is there, and it works for me in the live environment but
not in the new staging environment.


On Tue, Jun 30, 2020 at 2:29 PM Erick Erickson 
wrote:

> Did you put your auxiliary files in the
> confFiles tag? E.g. from the page you referenced:
>
> <str name="confFiles">schema.xml,stopwords.txt,elevate.xml</str>
>
> Best,
> Erick
>
> > On Jun 30, 2020, at 5:38 AM, Atita Arora  wrote:
> >
> > Hi,
> >
> > We are using Solr 6.6.2 in Master-Slave mode (the hot topic of
> > discussion threads these days!), and lately I ran into this weird
> > issue: on each replication trigger my index gets correctly replicated,
> > but my config changes are not replicated to my slaves.
> >
> > We use referential includes, i.e. my solrconfig.xml imports different
> > configs such as requesthandler_config.xml,
> > replication_handler_config.xml, etc., which essentially means, going by
> > the Solr doc
> > (https://lucene.apache.org/solr/guide/6_6/index-replication.html):
> >
> > Unlike the index files, where the timestamp is good enough to figure out
> if
> > they are identical, configuration files are compared against their
> > checksum. The schema.xml files (on master and slave) are judged to be
> > identical if their checksums are identical.
> >
> > The checksum of my solrconfig.xml would not vary; is that why my files
> > won't replicate?
> >
> > I already have another Master-Slave setup in a different environment
> > working with the same config version, so I don't suspect any issue with
> > the replication configuration.
> >
> > I have tried manual replication too, but the files would not change.
> > Maybe it is something weirdly trivial that I am missing here; any
> > pointers or ideas on what else I can check?
> >
> > Thank you,
> >
> > Atita
>
>
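A minimal sketch of the master-side replication config Erick is pointing at, with hypothetical file names for the imported configs; only the files explicitly listed in confFiles get shipped to the slaves, regardless of what solrconfig.xml itself imports:

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="master">
        <str name="replicateAfter">commit</str>
        <!-- every auxiliary file must be listed here by name -->
        <str name="confFiles">schema.xml,stopwords.txt,elevate.xml,requesthandler_config.xml,replication_handler_config.xml</str>
      </lst>
    </requestHandler>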


Config files not replicating

2020-06-30 Thread Atita Arora
Hi,

We are using Solr 6.6.2 in Master-Slave mode (the hot topic of discussion
threads these days!), and lately I ran into this weird issue: on each
replication trigger my index gets correctly replicated, but my config
changes are not replicated to my slaves.

We use referential includes, i.e. my solrconfig.xml imports different
configs such as requesthandler_config.xml,
replication_handler_config.xml, etc., which essentially means, going by the
Solr doc (https://lucene.apache.org/solr/guide/6_6/index-replication.html):

Unlike the index files, where the timestamp is good enough to figure out if
they are identical, configuration files are compared against their
checksum. The schema.xml files (on master and slave) are judged to be
identical if their checksums are identical.

The checksum of my solrconfig.xml would not vary; is that why my files
won't replicate?

I already have another Master-Slave setup in a different environment
working with the same config version, so I don't suspect any issue with the
replication configuration.

I have tried manual replication too, but the files would not change.
Maybe it is something weirdly trivial that I am missing here; any pointers
or ideas on what else I can check?

Thank you,

Atita


Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-19 Thread Atita Arora
I see so many topics being discussed in this thread that I literally got
lost somewhere, but I was just thinking: can we call it Parent-Child
architecture? I'm sure no one will raise an objection there.

Although, looking at the comments above, I still feel it would be a bigger
effort to convince everyone than to make the change. ;)

On Fri, 19 Jun 2020, 17:21 Mark H. Wood,  wrote:

> On Fri, Jun 19, 2020 at 09:22:49AM -0400, j.s. wrote:
> > On 6/18/20 9:50 PM, Rahul Goswami wrote:
> > > So +1 on "slave" being the problematic term IMO, not "master".
> >
> > but you cannot have a master without a slave, n'est-ce pas?
>
> Well, yes.  In education:  Master of Science, Arts, etc.  In law:
> Special Master (basically a judge's delegate).  See also "magistrate."
> None of these has any connotation of the ownership of one person by
> another.
>
> (It's a one-way relationship:  there is no slavery without mastery,
> but there are other kinds of mastery.)
>
> But this is an emotional issue, not a logical one.  If doing X makes
> people angry, and we don't want to make those people angry, then
> perhaps we should not do X.
>
> > i think it is better to use the metaphor of copying rather than one of
> > hierarchy. language has so many (unintended) consequences ...
>
> Sensible.
>
> --
> Mark H. Wood
> Lead Technology Analyst
>
> University Library
> Indiana University - Purdue University Indianapolis
> 755 W. Michigan Street
> Indianapolis, IN 46202
> 317-274-0749
> www.ulib.iupui.edu
>


Re: Getting rid of Master/Slave nomenclature in Solr

2020-06-18 Thread Atita Arora
+1 Noble and Ilan !!



On Thu, Jun 18, 2020 at 7:51 AM Noble Paul  wrote:

> Looking at the code I see a 692 occurrences of the word "slave".
> Mostly variable names and ref guide docs.
>
> The word "slave" is present in the responses as well. Any change in
> the request param/response payload is backward incompatible.
>
> I have no objection to changing the names in ref guide and other
> internal variables. Going ahead with backward incompatible changes is
> painful. If somebody has the appetite to take it up, it's OK
>
> If we must change, master/follower can be a good enough option.
>
> master (noun): A man in charge of an organization or group.
> master(adj) : having or showing very great skill or proficiency.
> master(verb): acquire complete knowledge or skill in (a subject,
> technique, or art).
> master (verb): gain control of; overcome.
>
> I hope nobody has a problem with the term "master"
>
> On Thu, Jun 18, 2020 at 3:19 PM Ilan Ginzburg  wrote:
> >
> > Would master/follower work?
> >
> > Half the rename work while still getting rid of the slavery
> connotation...
> >
> >
> > On Thu 18 Jun 2020 at 07:13, Walter Underwood 
> wrote:
> >
> > > > On Jun 17, 2020, at 4:00 PM, Shawn Heisey 
> wrote:
> > > >
> > > > It has been interesting watching this discussion play out on multiple
> > > open source mailing lists.  On other projects, I have seen a VERY high
> > > level of resistance to these changes, which I find disturbing and
> > > surprising.
> > >
> > > Yes, it is nice to see everyone just pitch in and do it on this list.
> > >
> > > wunder
> > > Walter Underwood
> > > wun...@wunderwood.org
> > > http://observer.wunderwood.org/  (my blog)
> > >
> > >
>
>
>
> --
> -
> Noble Paul
>


Re: Getting rid of Master/Slave nomenclature in Solr

2020-06-17 Thread Atita Arora
I agree with avoiding the SolrCloud terminology too.

I may suggest going for "prime" and "clone"
(short and precise, like Master and Slave).

Best,
Atita





On Wed, 17 Jun 2020, 22:50 Walter Underwood,  wrote:

> I strongly disagree with using the Solr Cloud leader/follower terminology
> for non-Cloud clusters. People in my company are confused enough without
> using polysemous terminology.
>
> “This node is the leader, but it means something different than the leader
> in this other cluster.” I’m dreading that conversation.
>
> I like “principal”. How about “clone” for the slave role? That suggests
> that
> it does not accept updates and that it is loosely-coupled, only depending
> on the state of the no-longer-called-master.
>
> Chegg has five production Solr Cloud clusters and one production
> master/slave
> cluster, so this is not a hypothetical for us. We have 100+ Solr hosts in
> production.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Jun 17, 2020, at 1:36 PM, Trey Grainger  wrote:
> >
> > Proposal:
> > "A Solr COLLECTION is composed of one or more SHARDS, which each have one
> > or more REPLICAS. Each replica can have a ROLE of either:
> > 1) A LEADER, which can process external updates for the shard
> > 2) A FOLLOWER, which receives updates from another replica"
> >
> > (Note: I prefer "role" but if others think it's too overloaded due to the
> > overseer role, we could replace it with "mode" or something similar)
> > ---
> >
> > To be explicit with the above definitions:
> > 1) In SolrCloud, the roles of leaders and followers can dynamically
> change
> > based upon the status of the cluster. In standalone mode, they can be
> > changed by manual intervention.
> > 2) A leader does not have to have any followers (i.e. only one active
> > replica)
> > 3) Each shard always has one leader.
> > 4) A follower can also pull updates from another follower instead of a
> > leader (traditionally known as a REPEATER). A repeater is still a
> follower,
> > but would not be considered a leader because it can't process external
> > updates.
> > 5) A replica cannot be both a leader and a follower.
> >
> > In addition to the above roles, each replica can have a TYPE of one of:
> > 1) NRT - which can serve in the role of leader or follower
> > 2) TLOG - which can only serve in the role of follower
> > 3) PULL - which can only serve in the role of follower
> >
> > A replica's type may be changed automatically in the event that its role
> > changes.
> >
> > I think this terminology is consistent with the current Leader/Follower
> > usage while also being able to easily accomodate a rename of the
> historical
> > master/slave terminology without mental gymnastics or the introduction or
> > more cognitive load through new terminology. I think adopting the
> > Primary/Replica terminology will be incredibly confusing given the
> already
> > specific and well established meaning of "replica" within Solr.
> >
> > All the Best,
> >
> > Trey Grainger
> > Founder, Searchkernel
> > https://searchkernel.com
> >
> >
> >
> > On Wed, Jun 17, 2020 at 3:38 PM Anshum Gupta 
> wrote:
> >
> >> Hi everyone,
> >>
> >> Moving a conversation that was happening on the PMC list to the public
> >> forum. Most of the following is just me recapping the conversation that
> has
> >> happened so far.
> >>
> >> Some members of the community have been discussing getting rid of the
> >> master/slave nomenclature from Solr.
> >>
> >> While this may require a non-trivial effort, a general consensus so far
> >> seems to be to start this process and switch over incrementally, if a
> >> single change ends up being too big.
> >>
> >> There have been a lot of suggestions around what the new nomenclature
> might
> >> look like, a few people don’t want to overlap the naming here with what
> >> already exists in SolrCloud i.e. leader/follower.
> >>
> >> Primary/Replica was an option that was suggested based on what other
> >> vendors are moving towards based on Wikipedia:
> >> https://en.wikipedia.org/wiki/Master/slave_(technology)
> >> , however there were concerns around the use of “replica” as that
> denotes a
> >> very specific concept in SolrCloud. Current terminology clearly
> >> differentiates the use of the traditional replication model from
> SolrCloud
> >> and reusing the names would make it difficult for that to happen.
> >>
> >> There were similar concerns around using Leader/follower.
> >>
> >> Let’s continue this conversation here while making sure that we converge
> >> without much bike-shedding.
> >>
> >> -Anshum
> >>
>
>


Re: Minimum Match Query

2020-05-06 Thread Atita Arora
Hi,

Did you happen to look into:

https://lucene.apache.org/solr/guide/6_6/the-dismax-query-parser.html#TheDisMaxQueryParser-Themm_MinimumShouldMatch_Parameter

I believe 6.5.1 has it too.

I hope it helps.


On Wed, May 6, 2020 at 6:46 PM Russell Bahr  wrote:

> Hi SOLR team,
> I have been asked if there is a way to return results only if those
> results match a minimum number of times present in the query.
> ( queries looking for a minimum amount of mentions for a particular
> term/phrase. Ie must be mentioned 'x' amount of times to return results).
> Is this something that is possible using SOLR 6.5.1?  Is this something
> that would require a newer version of SOLR?
> Any help on this would be appreciated.
> Thank you,
> Russ
>
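A hypothetical illustration of the mm parameter from the page linked above, assuming a collection named "news" and a "text" field; note that mm counts how many query clauses must match, not how many times a term occurs within one document:

    http://localhost:8983/solr/news/select?defType=dismax&q=apple+iphone+launch&qf=text&mm=2

With mm=2, only documents matching at least two of the three terms are returned.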


Re: Using QT param with /select

2020-03-10 Thread Atita Arora
Hi,
Thanks for looping back in.
We use Master-Slave!
I resolved this one by setting handleSelect=true, removing the /select
handler from the config, and creating another request handler (with a
different name, other than select) marked as default="true".
Now all requests without a qt param are handled by this request handler,
while those specifying qt go to the designated handlers.
The application I am working on has different request handlers, e.g. for
the different types of suggestions, spellchecking, the default search
request, etc., and the code leverages SolrJ with the qt param to dispatch
requests to the different handlers.


I hope that was the right way to go about it.


On Tue, Mar 10, 2020 at 8:26 AM Mikhail Khludnev  wrote:

> Hello, Atita.
>
> My question here is that on Solr 6.2.6 to enable using 'qt' param I need to
> > do handleSelect=false
>
> Can you elaborate on that? What exactly happens?  Also, please clarify
> whether you use SolrCloud or standalone?
>
>
> On Mon, Mar 2, 2020 at 7:37 PM Atita Arora  wrote:
>
> > Hi,
> >
> > I am working on improving a search app that uses the 'qt' param heavily
> > to redirect requests to different handlers based on the parameters
> > provided by the user.
> >
> > Also, for A/B testing of different configurations, we have used the qt
> > param to send requests to different handlers.
> > My question here is that on Solr 6.2.6, to enable using the 'qt' param I
> > need to set handleSelect=false, but /select is the default request
> > handler in the Solr administration UI and is used as the default
> > endpoint in all the integration tests.
> >
> > It may sound weird, but is there a way I can retain both behaviors:
> > no code changes to the integration test code, and the qt param working
> > again?
> >
> > Big thanks for any pointers !!
> >
> > Sincerely,
> > Atita
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>
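A rough solrconfig.xml sketch of the arrangement described above, with hypothetical handler names; with handleSelect="true" and no handler registered as /select, a request such as /select?qt=/suggest is dispatched to the handler named /suggest, while requests carrying no qt fall through to the handler marked default="true":

    <requestDispatcher handleSelect="true">
      <httpCaching never304="true"/>
    </requestDispatcher>

    <!-- serves requests that arrive without a qt param -->
    <requestHandler name="standard" class="solr.SearchHandler" default="true">
      <lst name="defaults">
        <str name="echoParams">explicit</str>
      </lst>
    </requestHandler>

    <!-- reached via /select?qt=/suggest -->
    <requestHandler name="/suggest" class="solr.SearchHandler">
      <!-- suggestion-specific defaults -->
    </requestHandler>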


Using QT param with /select

2020-03-02 Thread Atita Arora
Hi,

I am working on improving a search app that uses the 'qt' param heavily to
redirect requests to different handlers based on the parameters provided by
the user.

Also, for A/B testing of different configurations, we have used the qt
param to send requests to different handlers.
My question here is that on Solr 6.2.6, to enable using the 'qt' param I
need to set handleSelect=false, but /select is the default request handler
in the Solr administration UI and is used as the default endpoint in all
the integration tests.

It may sound weird, but is there a way I can retain both behaviors: no code
changes to the integration test code, and the qt param working again?

Big thanks for any pointers !!

Sincerely,
Atita


Re: Solr Relevancy problem

2020-02-19 Thread Atita Arora
+1 for Jörn's reply.
Along with that, you can use debugQuery to see how the query is being
parsed and what's going wrong.

Hope it helps,
Atita

On Wed, 19 Feb 2020, 09:19 Jörn Franke,  wrote:

> The best way to address this problem is to collect queries and examples
> why they are wrong and to document this. This is especially important when
> working with another vendor. Otherwise no one can give you proper help.
>
> > Am 19.02.2020 um 09:17 schrieb Pradeep Tambade <
> pradeep.tamb...@croma.com.invalid>:
> >
> > Hello,
> >
> > We have configured the Solr site search engine on our website
> > (www.croma.com). We are facing various issues: relevant results not
> > being shown, free-text search returning no results, phrase keywords
> > showing irrelevant results, etc.
> >
> > Please help us resolve these issues, and also help us connect with a
> > Solr tech support team or another company that is expert in managing
> > Solr search.
> >
> >
> > Thanks & Regards,
> > Pradeep Tambade |  Assistant Manager - Business Analyst
> > Infiniti Retail Ltd. | A Tata Enterprise
> > Mobile: +91 9664536737
> > Email: pradeep.tamb...@croma.com | Shop at: www.croma.com
> >
> >
>
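A hypothetical example of the debug suggestion above, assuming a collection named "products" and an edismax query over title/description fields:

    http://localhost:8983/solr/products/select?q=red+shoes&defType=edismax&qf=title+description&debugQuery=true

The "debug" block in the response shows the parsed query and a per-document score explanation, which usually makes it straightforward to document why a given result looks wrong.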


SolrTextTagger with multiple fields

2019-12-16 Thread Atita Arora
Hi,


I went through the SolrTextTagger in Solr, and beyond sounding interesting,
it left me wondering: what are the implications of using multiple tag
fields?

The idea is to identify different types of fields in the user query and use
them as filters.
Can anyone direct me to some examples?
Can we include a comma-separated list in the field param of the
request handler?

Thank you,
Atita
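For reference, the stock TaggerRequestHandler configuration takes a single tag field; a common workaround for multiple entity types (an assumption here, not a verified answer to the comma-separated-list question) is to copyField them into one combined tag field:

    <requestHandler name="/tag" class="solr.TaggerRequestHandler">
      <lst name="defaults">
        <str name="field">name_tag</str>  <!-- hypothetical combined tag field -->
      </lst>
    </requestHandler>

    <!-- in the schema: funnel each entity type into the combined field -->
    <copyField source="city" dest="name_tag"/>
    <copyField source="brand" dest="name_tag"/>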


Re: Solr master issue : IndexNotFoundException

2019-11-27 Thread Atita Arora
Did you happen to check the permissions? Give it a try with 777; maybe the
user running Solr does not have permission to access the index dir.

On Wed, Nov 27, 2019 at 3:45 PM Akreeti Agarwal  wrote:

> Hi,
>
> I removed the write.lock file from the index and then restarted the solr
> server, but still the same issue was observed.
>
> Thanks & Regards,
> Akreeti Agarwal
> (M) +91-8318686601
>
> -Original Message-
> From: Atita Arora 
> Sent: Wednesday, November 27, 2019 7:21 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr master issue : IndexNotFoundException
>
> It seems to be either the permission problem or maybe because of the
> write.lock file not removed due to process kill.
>
> Did you happen to check this one ?
>
> https://lucene.472066.n3.nabble.com/SolrCore-collection1-is-not-available-due-to-init-failure-td4094869.html
>
> On Wed, Nov 27, 2019 at 2:28 PM Akreeti Agarwal  wrote:
>
> > Hi All,
> >
> > I am getting these two errors after restarting my solr master server:
> >
> > null:org.apache.solr.common.SolrException: SolrCore 'sitecore_web_index'
> > is not available due to init failure: Error opening new searcher
> >
> > Caused by: org.apache.lucene.index.IndexNotFoundException: no
> > segments* file found in
> > LockValidatingDirectoryWrapper(NRTCachingDirectory(MMapDirectory@/solr
> > -m/server/solr/sitecore_web_index/data/index
> > lockFactory=org.apache.lucene.store.NativeFSLockFactory@5c6c24fd;
> > maxCacheMB=48.0 maxMergeSizeMB=4.0))
> >
> > Please help to resolve this.
> >
> > Thanks & Regards,
> > Akreeti Agarwal
> > (M) +91-8318686601
> >
> >
>
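A hypothetical way to act on the suggestions above, assuming Solr runs as a user named "solr" and the index path from the error message:

    ls -l /solr-m/server/solr/sitecore_web_index/data/index
    # prefer fixing ownership over a blanket chmod 777
    chown -R solr:solr /solr-m/server/solr/sitecore_web_index/data
    # remove the stale lock only while Solr is stopped
    rm /solr-m/server/solr/sitecore_web_index/data/index/write.lock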


Re: Solr master issue : IndexNotFoundException

2019-11-27 Thread Atita Arora
It seems to be either the permission problem or maybe because of the
write.lock file not removed due to process kill.

Did you happen to check this one ?
https://lucene.472066.n3.nabble.com/SolrCore-collection1-is-not-available-due-to-init-failure-td4094869.html

On Wed, Nov 27, 2019 at 2:28 PM Akreeti Agarwal  wrote:

> Hi All,
>
> I am getting these two errors after restarting my solr master server:
>
> null:org.apache.solr.common.SolrException: SolrCore 'sitecore_web_index'
> is not available due to init failure: Error opening new searcher
>
> Caused by: org.apache.lucene.index.IndexNotFoundException: no segments*
> file found in
> LockValidatingDirectoryWrapper(NRTCachingDirectory(MMapDirectory@/solr-m/server/solr/sitecore_web_index/data/index
> lockFactory=org.apache.lucene.store.NativeFSLockFactory@5c6c24fd;
> maxCacheMB=48.0 maxMergeSizeMB=4.0))
>
> Please help to resolve this.
>
> Thanks & Regards,
> Akreeti Agarwal
> (M) +91-8318686601
>
>


Re: Multi-lingual Search & Accent Marks

2019-08-30 Thread Atita Arora
We work on a German index; we neutralize accents before indexing, i.e.
umlauts to 'ae', 'ue', etc., and do the same at query time for an
appropriate match.

On Fri, Aug 30, 2019, 4:22 PM Audrey Lorberfeld - audrey.lorberf...@ibm.com
 wrote:

> Hi All,
>
> Just wanting to test the waters here – for those of you with search
> engines that index multiple languages, do you use ASCII-folding in your
> schema? We are onboarding Spanish documents into our index right now and
> keep going back and forth on whether we should preserve accent marks. From
> our query logs, it seems people generally do not include accents when
> searching, but you never know…
>
> Thank you in advance for sharing your experiences!
>
> --
> Audrey Lorberfeld
> Data Scientist, w3 Search
> Digital Workplace Engineering
> CIO, Finance and Operations
> IBM
> audrey.lorberf...@ibm.com
>
>
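A minimal sketch of the index- and query-time umlaut handling described above, assuming a hypothetical mapping file name; because the same analyzer runs on both sides, "Müller" and "mueller" meet at the same token:

    <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
      <analyzer> <!-- applied at index and query time -->
        <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-umlauts.txt"/>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

with mapping-umlauts.txt containing:

    "ä" => "ae"
    "Ä" => "Ae"
    "ö" => "oe"
    "Ö" => "Oe"
    "ü" => "ue"
    "Ü" => "Ue"
    "ß" => "ss"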


Re: Multi-language Spellcheck

2019-08-29 Thread Atita Arora
I would agree with the suggestion; I remember something similar being
presented at Berlin Buzzwords '19.

On Thu, Aug 29, 2019, 5:03 PM Jörn Franke  wrote:

> It could be sensible to have one spellchecker per language (as a different
> endpoint or as a query parameter at runtime). Alternatively, depending on
> your use case, you could get away with a generic field type that does not
> do anything language-specific, but I doubt it.
>
> > Am 29.08.2019 um 16:20 schrieb Audrey Lorberfeld -
> audrey.lorberf...@ibm.com :
> >
> > Hi All,
> >
> > We are starting up an internal search engine that has to work for many
> different languages. We are starting with a POC of Spanish and English
> documents, and we are using the DirectSolrSpellChecker.
> >
> > From reading others' threads online, I know that we have to have
> multiple spellcheckers to do this (1 for each language). However, would
> someone be able to clarify what should go in the "queryAnalyzerFieldType"
> tag? It seems that the tag can only take a single field. So, does that mean
> that I have to have a copy field that collates all tokens from all
> languages? Image of code attached for reference & sample code of
> English-only spellchecker below:
> >
> > <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
> >
> >   <str name="queryAnalyzerFieldType">???</str>
> >
> >   <lst name="spellchecker">
> >     <str name="name">default</str>
> >     <str name="field">minimal_en</str>
> >     <str name="classname">solr.DirectSolrSpellChecker</str>
> >     <str name="distanceMeasure">internal</str>
> >     <float name="accuracy">0.5</float>
> >     <int name="maxEdits">2</int>
> >     <int name="minPrefix">1</int>
> >     <int name="maxInspections">5</int>
> >     <int name="minQueryLength">4</int>
> >     <float name="thresholdTokenFrequency">0.05</float>
> >
> > ...
> >
> > Thank you!
> >
> > --
> > Audrey Lorberfeld
> > Data Scientist, w3 Search
> > Digital Workplace Engineering
> > CIO, Finance and Operations
> > IBM
> > audrey.lorberf...@ibm.com
> >
> >
> > On 8/29/19, 10:12 AM, "Joe Obernberger" 
> wrote:
> >
> >Thank you Erick.  I'm upgrading from 7.6.0 and as far as I can tell
> the
> >schema and configuration (solrconfig.xml) isn't different (apart from
> >the version).  Right now, I'm at a loss.  I still have the 7.6.0
> cluster
> >running and the query works OK there.
> >
> >Sure seems like I'm missing a field called 'features', but it's not
> >defined in the prior schema either.  Thanks again!
> >
> >-Joe
> >
> >>On 8/28/2019 6:19 PM, Erick Erickson wrote:
> >> What it says ;)
> >>
> >> My guess is that your configuration mentions the field “features” in,
> perhaps carrot.snippet or carrot.title.
> >>
> >> But it’s a guess.
> >>
> >> Best,
> >> Erick
> >>
> >>> On Aug 28, 2019, at 5:18 PM, Joe Obernberger <
> joseph.obernber...@gmail.com> wrote:
> >>>
> >>> Hi All - trying to use clustering with SolrCloud 8.2, but getting this
> error:
> >>>
> >>> "msg":"Error from server at null: org.apache.solr.search.SyntaxError:
> Query Field 'features' is not a valid field name",
> >>>
> >>> The URL, I'm using is:
> >>>
> http://solrServer:9100/solr/DOCS/select?q=*%3A*&qt=/clustering&clustering=true&clustering.collection=true
> <http://cronus:9100/solr/UNCLASS_2018_5_19_184/select?q=*%3A*&qt=/clustering&clustering=true&clustering.collection=true>
> >
> >>>
> >>> Thanks for any ideas!
> >>>
> >>> Complete response:
> >>> {
> >>>  "responseHeader":{
> >>>"zkConnected":true,
> >>>"status":400,
> >>>"QTime":38,
> >>>"params":{
> >>>  "q":"*:*",
> >>>  "qt":"/clustering",
> >>>  "clustering":"true",
> >>>  "clustering.collection":"true"}},
> >>>  "error":{
> >>>"metadata":[
> >>>  "error-class","org.apache.solr.common.SolrException",
> >>>  "root-error-class","org.apache.solr.common.SolrException",
> >>>
> "error-class","org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException",
> >>>
> "root-error-class","org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException"],
> >>>"msg":"Error from server at null:
> org.apache.solr.search.SyntaxError: Query Field 'features' is not a valid
> field name",
> >>>"code":400}}
> >>>
> >>>
> >>> -Joe
> >>>
> >>
> >
> >
> >
>
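A minimal sketch of the one-spellchecker-per-language idea, with hypothetical field names; a single SpellCheckComponent can hold several dictionaries, and the client selects one per request via spellcheck.dictionary, so the collated copy field question only arises if a single mixed-language dictionary is wanted:

    <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
      <str name="queryAnalyzerFieldType">text_general</str>
      <lst name="spellchecker">
        <str name="name">english</str>
        <str name="field">spell_en</str>
        <str name="classname">solr.DirectSolrSpellChecker</str>
      </lst>
      <lst name="spellchecker">
        <str name="name">spanish</str>
        <str name="field">spell_es</str>
        <str name="classname">solr.DirectSolrSpellChecker</str>
      </lst>
    </searchComponent>

queried with, e.g., &spellcheck=true&spellcheck.dictionary=spanish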


Re: Index fetch failed

2019-08-28 Thread Atita Arora
That looks like ample space to fetch the index chunk.
Also, I looked at the IndexFetcher code. I remember you were using Solr
5.5.5, and the only reason, in my view, this would happen is when the index
chunk is not downloaded, as can also be seen in the error (Downloaded
0!=123), which indicates the index generations are not in sync and that
this was not a user-aborted action either.

Is this error intermittent? Could there be a possibility that your master
has connection limits? Or maybe some network hiccup?



On Wed, Aug 28, 2019 at 10:40 AM Akreeti Agarwal  wrote:

> Hi,
>
> Memory details for slave1:
>
> Filesystem  Size  Used Avail Use% Mounted on
> /dev/xvda1   99G   40G   55G  43% /
> tmpfs   7.8G 0  7.8G   0% /dev/shm
>
> Memory details for slave2:
>
> Filesystem  Size  Used Avail Use% Mounted on
> /dev/xvda1   99G   45G   49G  48% /
> tmpfs   7.8G 0  7.8G   0% /dev/shm
>
> Thanks & Regards,
> Akreeti Agarwal
>
> -Original Message-
> From: Atita Arora 
> Sent: Wednesday, August 28, 2019 11:15 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Index fetch failed
>
> Hi,
>
> Do you have enough disk space free for the index chunk to be
> fetched/downloaded on the slave node?
>
>
> On Wed, Aug 28, 2019 at 6:57 AM Akreeti Agarwal  wrote:
>
> > Hello Everyone,
> >
> > I am getting this error continuously on Solr slave, can anyone tell me
> > the solution for this:
> >
> > 642141666 ERROR (indexFetcher-72-thread-1) [   x:sitecore_web_index]
> > o.a.s.h.ReplicationHandler Index fetch failed
> > :org.apache.solr.common.SolrException: Unable to download _12i7v_f.liv
> > completely. Downloaded 0!=123
> >  at
> >
> org.apache.solr.handler.IndexFetcher$FileFetcher.cleanup(IndexFetcher.java:1434)
> >  at
> >
> org.apache.solr.handler.IndexFetcher$FileFetcher.fetchFile(IndexFetcher.java:1314)
> >  at
> >
> org.apache.solr.handler.IndexFetcher.downloadIndexFiles(IndexFetcher.java:812)
> >  at
> > org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.jav
> > a:427)
> >
> >
> > Thanks & Regards,
> > Akreeti Agarwal
> > (M) +91-8318686601
> >
> 
> >
>


Re: Index fetch failed

2019-08-27 Thread Atita Arora
Hi,

Do you have enough disk space free for the index chunk to be
fetched/downloaded on the slave node?


On Wed, Aug 28, 2019 at 6:57 AM Akreeti Agarwal  wrote:

> Hello Everyone,
>
> I am getting this error continuously on Solr slave, can anyone tell me the
> solution for this:
>
> 642141666 ERROR (indexFetcher-72-thread-1) [   x:sitecore_web_index]
> o.a.s.h.ReplicationHandler Index fetch failed
> :org.apache.solr.common.SolrException: Unable to download _12i7v_f.liv
> completely. Downloaded 0!=123
>  at
> org.apache.solr.handler.IndexFetcher$FileFetcher.cleanup(IndexFetcher.java:1434)
>  at
> org.apache.solr.handler.IndexFetcher$FileFetcher.fetchFile(IndexFetcher.java:1314)
>  at
> org.apache.solr.handler.IndexFetcher.downloadIndexFiles(IndexFetcher.java:812)
>  at
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:427)
>
>
> Thanks & Regards,
> Akreeti Agarwal
> (M) +91-8318686601
>
>
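Since a network hiccup is one suspect above, a hedged sketch of the slave-side fetch timeouts the ReplicationHandler accepts, with a hypothetical master URL:

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="slave">
        <str name="masterUrl">http://master-host:8983/solr/sitecore_web_index/replication</str>
        <str name="pollInterval">00:00:60</str>
        <!-- connection and read timeouts for the fetch, in ms -->
        <str name="httpConnTimeout">5000</str>
        <str name="httpReadTimeout">30000</str>
      </lst>
    </requestHandler>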


Re: Solr master issue

2019-08-21 Thread Atita Arora
I think it would be useful to provide additional details: the Solr version,
when you started seeing the problem, whether you upgraded lately, and
whether you changed anything.

On Wed, Aug 21, 2019 at 2:17 PM Akreeti Agarwal  wrote:

> Hi,
>
> I am facing issue on my solr master, getting this in logs:
>
> 2019-08-19 22:29:55.573 ERROR (qtp59559151-101) [   ] o.a.s.s.HttpSolrCall
> null:org.apache.solr.common.SolrException: SolrCore
> 'sitecore_contents_index' is not available due to init failure: Error
> opening new searcher
> at
> org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:1071)
> at
> org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:252)
> at
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:414)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
> at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
> at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
> at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
> at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
> at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
> at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
> at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
> at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
> at
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
> at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
> at org.eclipse.jetty.server.Server.handle(Server.java:499)
> at
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
> at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
> at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
> at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
> at
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.(SolrCore.java:820)
> at org.apache.solr.core.SolrCore.(SolrCore.java:658)
> at
> org.apache.solr.core.CoreContainer.create(CoreContainer.java:820)
> at
> org.apache.solr.core.CoreContainer.access$000(CoreContainer.java:90)
> at
> org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:473)
> at
> org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:464)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> ... 1 more
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at
> org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1696)
> at
> org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1807)
> at
> org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:914)
> at org.apache.solr.core.SolrCore.(SolrCore.java:793)
> ... 10 more
> Caused by: java.nio.file.NoSuchFileException:
> /solr-m/server/solr/sitecore_contents_index_rebuild/data/index/_6lps.si
> at
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
> at
> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> at
> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> at
> sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
> at 

Re: modify query response plugin

2019-08-08 Thread Atita Arora
Isn't this resolved by simply adding the desired pre/post tags to the
highlighter request?

On Thu, Aug 8, 2019 at 11:20 PM Moyer, Brett  wrote:

> Highlight? What about using the Highlighter?
> https://lucene.apache.org/solr/guide/6_6/highlighting.html
>
> Brett Moyer
> Manager, Sr. Technical Lead | TFS Technology
>   Public Production Support
>   Digital Search & Discovery
>
> 8625 Andrew Carnegie Blvd | 4th floor
> Charlotte, NC 28263
> Tel: 704.988.4508
> Fax: 704.988.4907
> bmo...@tiaa.org
>
>
> -Original Message-
> From: Maria Muslea 
> Sent: Thursday, August 8, 2019 1:28 PM
> To: solr-user@lucene.apache.org
> Subject: Re: modify query response plugin
>
> Thank you for your response. I believe that the Tagger is used for NER,
> which is different than what I am trying to do.
> It is also available only with Solr 7 and I would need this to work with
> version 6.5.0.
>
> I am trying to manipulate the data that I already have in the response,
> and I can't find a good example of a plugin that does something similar, so
> I can see how I can access the response and construct a new one.
>
> Your help is greatly appreciated.
>
> Thank you,
> Maria
>
> On Tue, Aug 6, 2019 at 3:19 PM Erik Hatcher 
> wrote:
>
> > I think you’re looking for the Solr Tagger, described here:
> > https://lucidworks.com/post/solr-tagger-improving-relevancy/
> >
> > > On Aug 6, 2019, at 16:04, Maria Muslea  wrote:
> > >
> > > Hi,
> > >
> > > I am trying to implement a plugin that will modify my query
> > > response. For example, I would like to execute a query that will
> return something like:
> > >
> > > {...
> > > "description":"flights at LAX",
> > > "highlight":"airport;11;3"
> > > ...}
> > > This is information that I have in my document, so I can return it.
> > >
> > > Now, I would like the plugin to intercept the result, do some
> > > processing
> > on
> > > it, and return something like:
> > >
> > > {...
> > > "description":"flights at LAX",
> > > "highlight":{
> > >   "concept":"airport",
> > >   "description":"flights at LAX"
> > > ...}
> > >
> > > I looked at some RequestHandler implementations, but I can't find
> > > any sample code that would help me with this. Would this type of
> > > plugin be handled by a RequestHandler? Could you maybe point me to a
> > > sample plugin that does something similar?
> > >
> > > I would really appreciate your help.
> > >
> > > Thank you,
> > > Maria
> >
>
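A hypothetical request along the lines of the reply above, assuming a "description" field; hl.simple.pre/hl.simple.post control the markers wrapped around each match:

    .../select?q=flights&hl=true&hl.fl=description&hl.simple.pre=<em>&hl.simple.post=</em>

That only decorates matched terms, though; restructuring the response body itself, as Maria describes, would still point toward a custom component.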


Re: Mismatch between replication API & index.properties

2019-08-01 Thread Atita Arora
I am not sure, but just guessing: is this node acting as a repeater?

This seems legitimate; as Jai mentioned above, the discrepancy could be
because of unsuccessful replication due to disk space constraints.

On Thu, Aug 1, 2019 at 6:19 AM Aman Tandon  wrote:

> Yes, that is what my understanding is but if you see the Replication
> handler response it is saying it is referring to the index folder not to
> the one shown in index.properties. Due to that confusion I am not able to
> delete the folder.
>
> Is this some bug or default behavior where irrespective of the
> index.properties it will always shows the index folder only.
>
> Solr version - 6.6.2
>
> On Wed, Jul 31, 2019, 21:17 jai dutt  wrote:
>
> > It's correct behaviour , Solr put replica index file in this format only
> > and you can find latest index pointing in index.properties file. Usually
> > afer successful full replication Solr remove old timestamp dir.
> >
> > On Wed, 31 Jul, 2019, 8:02 PM Aman Tandon, 
> > wrote:
> >
> > > Hi,
> > >
> > > We are having a situation where whole disk space is full and in server
> > > where we are seeing the multiple index directories ending with the
> > > timestamp. Upon checking the index.properties file for a particular
> shard
> > > replica, it is not referring to the folder name *index *but when I am
> > using
> > > the replication API I am seeing it is pointing to *index *folder. Am I
> > > missing something? Kindly advise.
> > >
> > > directory:
> > >
> > > drwxrwxr-x. 2 fusion fusion 69632 Jul 30 23:24 index
> > > drwxrwxr-x. 2 fusion fusion 28672 Jul 31 03:02 index.20190731005047763
> > > drwxrwxr-x. 2 fusion fusion  4096 Jul 31 10:20 index.20190731095757917
> > > -rw-rw-r--. 1 fusion fusion78  Jul 31 03:02 index.properties
> > > -rw-rw-r--. 1 fusion fusion   296 Jul 31 09:56
> replication.properties
> > > drwxrwxr-x. 2 fusion fusion  4096 Jan 16  2019 snapshot_metadata
> > > drwxrwxr-x. 2 fusion fusion  4096 Jul 30 23:24 tlog
> > >
> > > *index.properties*
> > >
> > > #index.properties
> > > #Wed Jul 31 03:02:12 EDT 2019
> > > index=index.20190731005047763
> > >
> > > *REPLICATION API STATUS*
> > >
> > > 
> > > 280.56 GB
> > > 
> > > */opt/solr/x_shard4_replica3/data/index/*
> > > 
> > > ...
> > > true
> > > false
> > > 1564543395563
> > > 98884
> > > ...
> > > ...
> > >
> > > Regards,
> > > Aman
> > >
> >
>


Re: Very low filter cache hit ratio

2019-05-29 Thread Atita Arora
You can refer to this one:
https://teaspoon-consulting.com/articles/solr-cache-tuning.html

HTH,
Atita

On Wed, May 29, 2019 at 3:33 PM Saurabh Sharma 
wrote:

> Hi Shwan,
>
> Many filters are common among the queries. AFAIK, filter cache are created
> against filters and by that logic one should get good hit ratio for those
> cached filter conditions.i tried to create a cache of 100K size and that
> too was not producing good hit ratio. Any document/suggetion about
> efficient usage of various caches  and their internal working.
>
> Thanks
> Saurabh
>
> On Wed 29 May, 2019, 6:53 PM Shawn Heisey,  wrote:
>
> > On 5/29/2019 6:57 AM, Saurabh Sharma wrote:
> > > What can be the possible reasons for low cache usage?
> > > How can I leverage cache feature for high traffic indexes?
> >
> > Your usage apparently does not use the exact same query (or filter
> > query, in the case of filterCache) very often.
> >
> > In order to achieve a high hit ratio on a cache, the same query will
> > need to be used by many users.  That's not happening here.  I'm betting
> > that each user is sending something unique to Solr - which means it will
> > be impossible to get a hit, unless that user sends the same query again.
> >
> > Thanks,
> > Shawn
> >
>
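Following Shawn's point, the hit ratio only climbs when identical fq strings recur, so the usual fix is to split queries so the reusable parts travel as separate fq clauses; for reference, a hypothetical solrconfig.xml cache entry:

    <filterCache class="solr.FastLRUCache"
                 size="512"
                 initialSize="512"
                 autowarmCount="128"/>

For example, fq=status:active&fq=price:[10 TO 100] caches two independently reusable entries, whereas folding everything into one unique q string caches nothing reusable.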


Re: Retrieving docs in the same order as provided in the query

2019-05-09 Thread Atita Arora
Sure,
I can give this a shot! Hope it works out well for bigger result sets too :)

Big Thanks, Erik :)



On Thu, May 9, 2019 at 3:20 PM Erik Hatcher  wrote:

> Atita -
>
> You mean something like q=id:(X Y Z) to be able to order them arbitrarily?
>
> Yes, you can use the constant score query syntax to set the score, e.g.:
>
>q=id:Z^=3 OR id:Y^=2 OR id:X^=1
>
> Hope that helps.
>
> Erik
>
>
> > On May 9, 2019, at 8:55 AM, Atita Arora  wrote:
> >
> > Hi,
> >
> > Is there some way to retrieve the docs in the same order as queried in
> > the Solr query?
> >
> > I am aware of leveraging bq for this and have even tried overriding a
> > custom similarity to achieve this, but I am looking for something
> > simpler.
> >
> > Please enlighten me.
> >
> > Best Regards,
> > Atita
>
>
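Spelling out Erik's constant-score suggestion as a full request, with hypothetical ids; since results sort by score descending by default, the documents come back in the order Z, Y, X:

    q=id:Z^=3 OR id:Y^=2 OR id:X^=1&fl=id,score

For longer lists, the constants can simply be generated as n, n-1, ... 1 from the desired order.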


Retrieving docs in the same order as provided in the query

2019-05-09 Thread Atita Arora
Hi,

Is there some way to retrieve the docs in the same order as queried in
the Solr query?

I am aware of leveraging bq for this and have even tried overriding a
custom similarity to achieve this, but I am looking for something simpler.

Please enlighten me.

Best Regards,
Atita


Re: Need information on EofExceptions in solr 4.8.1

2019-03-19 Thread Atita Arora
Precisely: check the socketTimeout (in ms) if it's your indexing pipeline
code. We faced this when our docs were unusually big.

On Tue, Mar 19, 2019 at 7:08 PM Saurabh Sharma 
wrote:

> Hi,
>
> Seems like it is not a problem with solr. It is happening due to stream
> termination at the jetty.
> Please make sure your client is not setting very low read timeout. You can
> also increase max sessions timeout and idleTimeout at jetty level.
>
> Thanks
> Saurabh Sharma
>
> On Tue, Mar 19, 2019 at 11:19 PM Vijay Rawlani 
> wrote:
>
> > Dear Concerned,
> >
> > We are using solr 4.8.1 in our project. We are observing following
> > EofExceptions in solr.
> > It would be helpful for us to know in what situations we might land up
> > with this.
> > Can we get rid of this with any solr configuration or is there any way
> > forward at all?
> > Kindly let us know some information about the exception and the scenario
> > where it can occur.
> >
> > 019-03-17T00:00:25.604457+00:00@solr@UNKNOWN@
> > org.apache.solr.servlet.SolrDispatchFilter:120 -
> > null:org.eclipse.jetty.io.EofException#012#011at
> > org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:142)#012#011at
> > org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:107)#012#011at
> >
> org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:214)#012#011at
> >
> >
> org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:207)#012#011at
> >
> >
> org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:98)#012#011at
> >
> >
> org.apache.solr.response.BinaryResponseWriter.write(BinaryResponseWriter.java:51)#012#011at
> >
> >
> org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:755)#012#011at
> >
> >
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:431)#012...
> >
> > 2019-03-17T00:00:25.604457+00:00@solr@[INVALID_PROGRAM]@
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)#012#011at
> >
> >
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)#012#011at
> >
> >
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)#012#011at
> >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)#012#011at
> >
> >
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)#012#011at
> >
> >
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)#012#011at
> >
> >
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)#012#011at
> >
> >
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)#012#011at
> >
> > org.eclipse.jetty.server.session.SessionHandler.doScope(Sess...
> >
> > 2019-03-17T00:00:25.604457+00:00@solr@[INVALID_PROGRAM]@193)#012#011at
> >
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)#012#011at
> >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)#012#011at
> >
> >
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)#012#011at
> >
> >
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)#012#011at
> >
> >
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)#012#011at
> >
> > org.eclipse.jetty.server.Server.handle(Server.java:368)#012#011at
> >
> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)#012#011at
> >
> >
> org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)#012#011at
> >
> > org.e...
> >
> > 2019-03-17T00:00:25.604457+00:00@solr@[INVALID_PROGRAM]@953)#012#011at
> >
> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)#012#011at
> >
> >
> org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)#012#011at
> >
> org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)#012#011at
> >
> >
> org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)#012#011at
> >
> >
> org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)#012#011at
> >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)#012#011at
> >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)#012#011at
> >
> > java.lang.Thread.run(Thread.java:748)#012
> >
> > 2019-03-17T00:00:25.604457+00:00@solr@UNKNOWN@
> > org.eclipse.jetty.server.Response:312 - Committed before 500
> > {trace=org.eclipse.jetty.io.EofException#012#011at
> > org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:142)#012#011at
> > org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:107)#012#011at
> >
> org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:214)#012#011at
> >
> >
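If the indexing client is SolrJ, a hedged sketch of raising the read timeout discussed above, using the SolrJ 4.x API that matches Solr 4.8.1 and a hypothetical core URL:

    // generous timeouts so large documents don't trip an EofException mid-request
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/mycore");
    server.setConnectionTimeout(10000); // ms to establish the connection
    server.setSoTimeout(120000);        // socket read timeout in ms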
> 

Re: Solr thread dump analysis

2019-01-26 Thread Atita Arora
I believe you can try using gceasy.io.
It's pretty decent and gives you a comprehensive analysis of what's going
on and what can be altered.
Upload your GC logs and try it!

On Sat, Jan 26, 2019 at 8:29 PM Rajdeep Sahoo 
wrote:

> Hi all,
> How can I analyse solr thread dump and how the thread dump analysis can be
> helpful for improving solr performance. Please suggest
>
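For completeness, a hypothetical way to capture the dump itself before analyzing it, assuming a JDK on the box and that Solr was started via Jetty's start.jar:

    jstack -l $(pgrep -f start.jar) > solr-threaddump-$(date +%s).txt

Taking a few dumps several seconds apart makes stuck or contended threads stand out.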


Re: REBALANCELEADERS is not reliable

2018-11-29 Thread Atita Arora
Indeed, I tried that on 7.4 and 7.5 too; it did not work for me either,
even with the preferredLeader property as recommended in the documentation.
I handled it with a little hack, but it certainly didn't work as expected.
I can provide more details if there's a ticket.

On Thu, Nov 29, 2018 at 8:42 PM Aman Tandon  wrote:

> ++ correction
>
> On Fri, Nov 30, 2018, 01:10 Aman Tandon 
> > For me today, I deleted the leader replica of one of the two shard
> > collection. Then other replicas of that shard wasn't getting elected for
> > leader.
> >
> > After waiting for long tried the setting addreplicaprop preferred leader
> > on one of the replica then tried FORCELEADER but no luck. Then also tried
> > rebalance but no help. Finally have to recreate the whole collection.
> >
> > Not sure what was the issue but both FORCELEADER AND REBALANCING didn't
> > work if there was no leader however preferred leader property was setted.
> >
> > On Wed, Nov 28, 2018, 12:54 Bernd Fehling <
> bernd.fehl...@uni-bielefeld.de
> > wrote:
> >
> >> Hi Vadim,
> >>
> >> thanks for confirming.
> >> So it seems to be a general problem with Solr 6.x, 7.x and might
> >> be still there in the most recent versions.
> >>
> >> But where to start to debug this problem, is it something not
> >> correctly stored in zookeeper or is overseer the problem?
> >>
> >> I was also reading something about a "leader queue" where possible
> >> leaders have to be requeued or something similar.
> >>
> >> May be I should try to get a situation where a "locked" core
> >> is on the overseer and then connect the debugger to it and step
> >> through it.
> >> Peeking and poking around, like old Commodore 64 days :-)
> >>
> >> Regards, Bernd
> >>
> >>
> >> Am 27.11.18 um 15:47 schrieb Vadim Ivanov:
> >> > Hi, Bernd
> >> > I have tried REBALANCELEADERS with Solr 6.3 and 7.5
> >> > I had very similar results and notion that it's not reliable :(
> >> > --
> >> > Br, Vadim
> >> >
> >> >> -Original Message-
> >> >> From: Bernd Fehling [mailto:bernd.fehl...@uni-bielefeld.de]
> >> >> Sent: Tuesday, November 27, 2018 5:13 PM
> >> >> To: solr-user@lucene.apache.org
> >> >> Subject: REBALANCELEADERS is not reliable
> >> >>
> >> >> Hi list,
> >> >>
> >> >> unfortunately REBALANCELEADERS is not reliable and the leader
> >> >> election has unpredictable results with SolrCloud 6.6.5 and
> >> >> Zookeeper 3.4.10.
> >> >> Seen with 5 shards / 3 replicas.
> >> >>
> >> >> - CLUSTERSTATUS reports all replicas (core_nodes) as state=active.
> >> >> - setting with ADDREPLICAPROP the property preferredLeader to other
> >> replicas
> >> >> - calling REBALANCELEADERS
> >> >> - some leaders have changed, some not.
> >> >>
> >> >> I then tried:
> >> >> - removing all preferredLeader properties from replicas which
> >> succeeded.
> >> >> - trying again REBALANCELEADERS for the rest. No success.
> >> >> - Shutting down nodes to force the leader to a specific replica left
> >> running.
> >> >>No success.
> >> >> - calling REBALANCELEADERS responds that the replica is inactive!!!
> >> >> - calling CLUSTERSTATUS reports that the replica is active!!!
> >> >>
> >> >> Also, the replica which don't want to become leader is not in the
> list
> >> >> of collections->[collection_name]->leader_elect->shard1..x->election
> >> >>
> >> >> Where is CLUSTERSTATUS getting it's state info from?
> >> >>
> >> >> Has anyone else problems with REBALANCELEADERS?
> >> >>
> >> >> I noticed that the Reference Guide writes "preferredLeader" (with
> >> capital "L")
> >> >> but the JAVA code has "preferredleader".
> >> >>
> >> >> Regards, Bernd
> >> >
> >>
> >
>
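For anyone reproducing this, the calls under discussion look roughly like the following, with hypothetical collection and replica names:

    # mark the desired leader
    .../admin/collections?action=ADDREPLICAPROP&collection=mycoll&shard=shard1&replica=core_node3&property=preferredLeader&property.value=true

    # then ask Solr to shift leadership accordingly
    .../admin/collections?action=REBALANCELEADERS&collection=mycoll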


Re: Query to multiple collections

2018-10-25 Thread Atita Arora
Hi,

This is similar to a problem I was facing recently.
In my use case I am supposed to show (collated) spellcheck suggestions from
two different collections.
Both collections use the same schema, but they need to be segregated
because of the business areas they serve.

I considered the aliasing approach too, but was a little unsure whether it
would work for me.
Weirdly, the standard select URL itself is trouble for me, and I run into
the following exception in my browser:

http://:8983/solr/products.1,products.3/select?q=*:*

{
  "responseHeader": {
"zkConnected": true,
"status": 500,
"QTime": 24,
"params": {
  "q": "*:*"
}
  },
  "error": {
"trace": "java.lang.NullPointerException\n\tat
org.apache.solr.handler.component.QueryComponent.unmarshalSortValues(QueryComponent.java:1034)\n\tat
org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:885)\n\tat
org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:585)\n\tat
org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:564)\n\tat
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:423)\n\tat
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)\n\tat
org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)\n\tat
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:710)\n\tat
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)\n\tat
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1751)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\n\tat
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
org.eclipse.jetty.server.Server.handle(Server.java:534)\n\tat
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)\n\tat
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)\n\tat
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)\n\tat
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)\n\tat
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\n\tat
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)\n\tat
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)\n\tat
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)\n\tat
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)\n\tat
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)\n\tat
java.lang.Thread.run(Thread.java:748)\n",
"code": 500
  }
}

I would really appreciate it if someone could tell me what might be
happening here.

Thanks,
Atita

On Tue, Oct 23, 2018 at 1:58 AM Rohan Kasat  wrote:

> Thanks Shawn for the update.
> I am going ahead with the standard aliases approach , suits my use case.
>
> Regards,
> Rohan Kasat
>
>
> On Mon, Oct 22, 2018 at 4:49 PM Shawn Heisey  wrote:
>
> > On 10/22/2018 1:26 PM, Chris Ulicny wrote:
> > > There weren't any particular problems we ran into since the client that
> > > makes the queries to multiple collections previously would query
> multiple
> > > cores using the 'shards' parameter before we moved to solrcloud. We
> > didn't
> > > have any complicated sorting or scoring requirements fortunately.
> > >
> > > The one thing I remember looking into was what solr would do when two
> > > documents with the same id were found in both collections. I believe it
> > > just 

Re: Integrate nutch with solr

2018-10-22 Thread Atita Arora
and
https://lobster1234.github.io/2017/08/14/search-with-nutch-mongodb-solr/

On Tue, Oct 23, 2018 at 1:12 AM Atita Arora  wrote:

> I think this should be kind of useful :
>
>
> https://blog.building-blocks.com/building-a-search-engine-with-nutch-and-solr-in-10-minutes/
>
> I integrated Aperture with Solr way back in 2008.
>
> On Mon, Oct 22, 2018 at 11:27 PM Dinesh Sundaram 
> wrote:
>
>> Thanks Shawn for the reply, yes I do have some questions on the solr too.
>> Can you please share the steps for the Solr side to integrate Nutch, or are
>> no steps needed in Solr?
>>
>> On Thu, Oct 18, 2018 at 8:35 PM Shawn Heisey  wrote:
>>
>> > On 10/18/2018 12:35 PM, Dinesh Sundaram wrote:
>> > > Can you please share the steps to integrate nutch 2.3.1 with solrcloud
>> > > 7.1.0.
>> >
>> > You will need to speak to the nutch project about how to configure their
>> > software to interact with Solr.  If you have questions about Solr
>> > itself, we can answer those.
>> >
>> > http://nutch.apache.org/mailing_lists.html
>> >
>> > Thanks,
>> > Shawn
>> >
>> >
>>
>


Re: Integrate nutch with solr

2018-10-22 Thread Atita Arora
I think this should be kind of useful :

https://blog.building-blocks.com/building-a-search-engine-with-nutch-and-solr-in-10-minutes/

I integrated Aperture with Solr way back in 2008.

On Mon, Oct 22, 2018 at 11:27 PM Dinesh Sundaram 
wrote:

> Thanks Shawn for the reply, yes I do have some questions on the solr too.
> Can you please share the steps for the Solr side to integrate Nutch, or are
> no steps needed in Solr?
>
> On Thu, Oct 18, 2018 at 8:35 PM Shawn Heisey  wrote:
>
> > On 10/18/2018 12:35 PM, Dinesh Sundaram wrote:
> > > Can you please share the steps to integrate nutch 2.3.1 with solrcloud
> > > 7.1.0.
> >
> > You will need to speak to the nutch project about how to configure their
> > software to interact with Solr.  If you have questions about Solr
> > itself, we can answer those.
> >
> > http://nutch.apache.org/mailing_lists.html
> >
> > Thanks,
> > Shawn
> >
> >
>


Re: SPLITSHARD throwing OutOfMemory Error

2018-10-04 Thread Atita Arora
Hi Andrzej,

We have been weighing a lot of other reasons to upgrade our Solr for a
very long time, like better authentication handling, backups using CDCR and
the new replication modes, and this probably just gives us another reason to
upgrade.
Thank you so much for the suggestion; I think it's good to know that
something like this exists. We'll find out more about it.
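
For anyone else landing on this thread, the Solr 7.5 call would presumably
look something like the following (the collection and shard names are just
the ones from this test setup):

http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=testcollection&shard=shard1&splitMethod=link&async=1000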

Great day ahead!

Regards,
Atita



On Thu, Oct 4, 2018 at 11:28 AM Andrzej Białecki  wrote:

> I know it’s not much help if you’re stuck with Solr 6.1 … but Solr 7.5
> comes with an alternative strategy for SPLITSHARD that doesn’t consume as
> much memory and nearly doesn’t consume additional disk space on the leader.
> This strategy can be turned on by “splitMethod=link” parameter.
>
> > On 4 Oct 2018, at 10:23, Atita Arora  wrote:
> >
> > Hi Edwin,
> >
> > Thanks for following up on this.
> >
> > So here are the configs :
> >
> > Memory - 30G - 20 G to Solr
> > Disk - 1TB
> > Index = ~ 500G
> >
> > and I think the reason this could be happening is that during a shard
> > split, the unsplit index and the split index both persist on the
> > instance, which may be causing this.
> > I actually tried splitshard on another instance with index size 64G and
> it
> > went through without any issues.
> >
> > I would appreciate if you have additional information to enlighten me on
> > this issue.
> >
> > Thanks again.
> >
> > Regards,
> >
> > Atita
> >
> > On Thu, Oct 4, 2018 at 9:47 AM Zheng Lin Edwin Yeo  >
> > wrote:
> >
> >> Hi Atita,
> >>
> >> What is the amount of memory that you have in your system?
> >> And what is your index size?
> >>
> >> Regards,
> >> Edwin
> >>
> >> On Tue, 25 Sep 2018 at 22:39, Atita Arora  wrote:
> >>
> >>> Hi,
> >>>
> >>> I am working on a test setup with Solr 6.1.0 cloud with 1 collection
> >>> sharded across 2 shards with no replication. When triggered a
> SPLITSHARD
> >>> command it throws "java.lang.OutOfMemoryError: Java heap space"
> >> everytime.
> >>> I tried this with multiple heap settings of 8, 12 & 20G but every time
> it
> >>> does create 2 sub-shards but then fails eventually.
> >>> I know the issue => https://jira.apache.org/jira/browse/SOLR-5214 has
> >> been
> >>> resolved but the trace looked very similar to this one.
> >>> Also just to ensure that I do not run into exceptions due to merge as
> >>> reported in this ticket, I also tried running optimize before
> proceeding
> >>> with splitting the shard.
> >>> I issued the following commands :
> >>>
> >>> 1.
> >>>
> >>>
> >>
> http://localhost:8983/solr/admin/collections?collection=testcollection&shard=shard1&action=SPLITSHARD
> >>>
> >>> This threw java.lang.OutOfMemoryError: Java heap space
> >>>
> >>> 2.
> >>>
> >>>
> >>
> http://localhost:8983/solr/admin/collections?collection=testcollection&shard=shard1&action=SPLITSHARD&async=1000
> >>>
> >>> Then I ran with async=1000 and checked the status. Every time It's
> >> creating
> >>> the sub shards, but not splitting the index.
> >>>
> >>> Is there something that I am not doing correctly?
> >>>
> >>> Please guide.
> >>>
> >>> Thanks,
> >>> Atita
> >>>
> >>
>
> —
>
> Andrzej Białecki
>
>


Re: SPLITSHARD throwing OutOfMemory Error

2018-10-04 Thread Atita Arora
Hi Edwin,

Thanks for following up on this.

So here are the configs :

Memory - 30G - 20 G to Solr
Disk - 1TB
Index = ~ 500G

and I think the reason this could be happening is that during a shard
split, the unsplit index and the split index both persist on the instance,
which may be causing this.
I actually tried splitshard on another instance with index size 64G and it
went through without any issues.

I would appreciate if you have additional information to enlighten me on
this issue.

Thanks again.

Regards,

Atita

On Thu, Oct 4, 2018 at 9:47 AM Zheng Lin Edwin Yeo 
wrote:

> Hi Atita,
>
> What is the amount of memory that you have in your system?
> And what is your index size?
>
> Regards,
> Edwin
>
> On Tue, 25 Sep 2018 at 22:39, Atita Arora  wrote:
>
> > Hi,
> >
> > I am working on a test setup with Solr 6.1.0 cloud with 1 collection
> > sharded across 2 shards with no replication. When triggered a SPLITSHARD
> > command it throws "java.lang.OutOfMemoryError: Java heap space"
> everytime.
> > I tried this with multiple heap settings of 8, 12 & 20G but every time it
> > does create 2 sub-shards but then fails eventually.
> > I know the issue => https://jira.apache.org/jira/browse/SOLR-5214 has
> been
> > resolved but the trace looked very similar to this one.
> > Also just to ensure that I do not run into exceptions due to merge as
> > reported in this ticket, I also tried running optimize before proceeding
> > with splitting the shard.
> > I issued the following commands :
> >
> > 1.
> >
> >
> http://localhost:8983/solr/admin/collections?collection=testcollection&shard=shard1&action=SPLITSHARD
> >
> > This threw java.lang.OutOfMemoryError: Java heap space
> >
> > 2.
> >
> >
> http://localhost:8983/solr/admin/collections?collection=testcollection&shard=shard1&action=SPLITSHARD&async=1000
> >
> > Then I ran with async=1000 and checked the status. Every time It's
> creating
> > the sub shards, but not splitting the index.
> >
> > Is there something that I am not doing correctly?
> >
> > Please guide.
> >
> > Thanks,
> > Atita
> >
>


Re: Solr Search Special Characters

2018-09-27 Thread Atita Arora
Hi Piyush,

This sounds like an encoding problem.
Can you try q= Tata%20%26%20Sons ?

I believe for '&' you can use %26 in your query (refer to
https://meyerweb.com , encode your queries there and check whether they work
as expected).
You can also try debug=true to see what query is actually sent.

I am sure you must have checked =>
http://lucene.apache.org/core/6_5_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Escaping_Special_Characters
too.
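
If the query is built in SolrJ, there is also a client-side helper that
escapes all of these characters; a minimal sketch (the field name is just an
example):

import org.apache.solr.client.solrj.util.ClientUtils;

String q = "name:" + ClientUtils.escapeQueryChars("Tata & Sons");
// escapes &, ( and ) as well as the whitespace, so the whole string stays
// one literal term and words like AND are not parsed as operators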

Atita



On Thu, Sep 27, 2018 at 6:39 AM Rathor, Piyush (US - Philadelphia) <
prat...@deloitte.com> wrote:

> Hi All,
>
>
>
> We are facing some issues in search with special characters. Can you
> please help with the query when the search contains the following characters:
>
> • “&”
>
>Example – Tata & Sons
>
> • AND
>
>Example – Tata AND Sons
>
> • (
>
>Example – People (Pvt) Ltd
>
> • )
>
>Example – People (Pvt) Ltd
>
>
>
>
>
> Thanks & Regards
>
> Piyush Rathor
>
> Consultant
>
> Deloitte Digital (Salesforce.com / Force.com)
>
> Deloitte Consulting Pvt. Ltd.
>
> Office: +1 (615) 209 4980
>
> Mobile : +1 (302) 397 1491
>
> prat...@deloitte.com | www.deloitte.com
>
>
>
> Please consider the environment before printing.
>
>
>
> This message (including any attachments) contains confidential information
> intended for a specific individual and purpose, and is protected by law. If
> you are not the intended recipient, you should delete this message and any
> disclosure, copying, or distribution of this message, or the taking of any
> action based on it, by you is strictly prohibited.
>
> v.E.1
>


SPLITSHARD throwing OutOfMemory Error

2018-09-25 Thread Atita Arora
Hi,

I am working on a test setup with Solr 6.1.0 cloud with 1 collection
sharded across 2 shards with no replication. Whenever a SPLITSHARD command
is triggered, it throws "java.lang.OutOfMemoryError: Java heap space".
I tried this with multiple heap settings of 8, 12 & 20G, but every time it
does create the 2 sub-shards and then eventually fails.
I know the issue => https://jira.apache.org/jira/browse/SOLR-5214 has been
resolved, but the trace looked very similar to this one.
Also, just to ensure that I do not run into exceptions due to merges as
reported in that ticket, I tried running optimize before proceeding with
splitting the shard.
I issued the following commands :

1.
http://localhost:8983/solr/admin/collections?collection=testcollection&shard=shard1&action=SPLITSHARD

This threw java.lang.OutOfMemoryError: Java heap space

2.
http://localhost:8983/solr/admin/collections?collection=testcollection&shard=shard1&action=SPLITSHARD&async=1000

Then I ran it with async=1000 and checked the status. Every time it's creating
the sub-shards but not splitting the index.

Is there something that I am not doing correctly?

Please guide.

Thanks,
Atita


CDCR setup with Custom Document Routing

2018-05-18 Thread Atita Arora
Hi,

I am about to set up CDCR for a Solr cluster which uses custom document
routing.
Has anyone tried that before?
Are there any caveats to know about beforehand?
I will be setting up unidirectional CDCR in Solr 7.3.

Per documentation -

> The current design works most robustly if both the Source and target
> clusters have the same number of shards. There is no requirement that the
> shards in the Source and target collection have the same number of replicas.
> Having different numbers of shards on the Source and target cluster is
> possible, but is also an "expert" configuration as that option imposes
> certain constraints and is not recommended. Most of the scenarios where
> having differing numbers of shards are contemplated are better accomplished
> by hosting multiple shards on each target Solr instance.


I am particularly curious to know how this would fare if that recommendation
isn't followed.
I would highly appreciate any pointers around this.

Sincerely,
Atita


Re: Determine Solr Core Creation Timestamp

2018-05-08 Thread Atita Arora
Thank you Shawn for looking into this in such depth.
Let me try to get hold of some way to grab this information and use it, and
I may reach back to you or the list for further thoughts.

Thanks again,
Atita

On Tue, May 8, 2018, 3:11 PM Shawn Heisey <apa...@elyograg.org> wrote:

> On 5/7/2018 3:50 PM, Atita Arora wrote:
> > I noticed the same and hence overruled the idea to use it.
> > Further , while exploring the V2 api (as we're currently in Solr 6.6 and
> > will soon be on Solr 7.X) ,I came across the shards API which has
> > "property.index.version": "1525453818563"
> >
> > Which is listed for each of the shards. I wonder if I should be
> leveraging
> > this as this seem to be the index version & I dont think this number
> should
> > vary on restart.
>
> The index version is a number that is milliseconds since the epoch --
> 1970-01-01 00:00:00 UTC.  This is how Java represents timestamps
> internally.  All Lucene indexes have this information.
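> (As a concrete worked example, the value quoted above, 1525453818563,
> decodes to 2018-05-04 17:10:18 UTC.)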
>
> The index version value appears to update every time the index changes,
> probably when a new searcher is opened.
>
> For SolrCloud collections, this information is actually already
> available, although getting to it may not be obvious.  ZooKeeeper itself
> keeps track of when all znodes are created, so the /collections/x
> znode creation time is effectively what you're after.  This can be seen
> in Cloud->Tree in the admin UI, which means that there is a way to
> obtain the information with an HTTP API.
>
> When cores are created or manipulated by API calls, the core.properties
> file will have a comment with a timestamp of the last time Solr
> wrote/changed the file.  CoreAdmin operations like CREATE, SWAP, RENAME,
> and others will update or create the timestamp in that comment, but if
> the properties file doesn't ever get changed by Solr, then the comment
> would reflect the creation time.  That makes it not entirely reliable.
> Also, I do not know of a way to access that information with any Solr
> API -- access to the filesystem would probably be required.
>
> The core.properties file could be a place to store a true creation time,
> using a new property that Solr doesn't need for any other purpose.  Solr
> could look for a creation time in that file when the core is started and
> update it to include the current time as the creation time if it is not
> present, and certain CoreAdmin operations could also write that
> property.  Retrieving the value would needed to be added to the
> CoreAdmin API.
>
> Thanks,
> Shawn
>
>


Re: Determine Solr Core Creation Timestamp

2018-05-07 Thread Atita Arora
Hi Shawn,

I noticed the same and hence ruled out the idea of using it.
Further, while exploring the v2 API (as we're currently on Solr 6.6 and
will soon be on Solr 7.x), I came across the shards API, which has
"property.index.version": "1525453818563"

listed for each of the shards. I wonder if I should be leveraging this, as
it seems to be the index version, and I don't think this number should
vary on restart.

Any pointers ?

Thanks,
Atita


On Mon, May 7, 2018 at 11:16 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 5/6/2018 3:09 PM, Atita Arora wrote:
>
>> I am working on a developing a utility which lets one monitor the
>> indexPipeline Status.
>> The indexing job runs in two forms where either it -
>> 1. Creates a new core OR
>> 2. Runs the delta on existing core.
>> To put down to simplest form I look into the DB timestamp when the
>> indexing
>> job was triggered and have a desire to read some stat / metric from Solr
>> (preferably an API) which reports a timestamp when the CORE was created /
>> modified.
>> My utility completely relies on the difference between timestamps from DB
>> &
>> Solr as these two timestamps are leveraged to determine health of
>> pipeline.
>>
>> I see the Master Version Timestamp under each shard which details the
>> version / Gen / Size.
>> Is that what I should be using ? How can I grab these from API ?
>> I tried using metrics api :
>> *http://localhost:8983/solr/admin/metrics?group=core&prefix=CORE*
>> which details *CORE.startTime *but this timestamp changes whenever data is
>> being added to any core on this node.
>> *Is there any other suggestion to use some other way to determine the core
>> creation timestamp* ?
>>
>
> The startTime value is the time at which Solr started the core.  If that
> is getting updated frequently, then a reload operation is probably
> happening on the core.  Or, less likely, the Solr instance has been
> restarted.  I have checked a 6.6 system and on a core that is getting
> updates as frequently as once a minute, startTime is a couple of days ago,
> which was the last time that core was reloaded.
>
> I've been trying to figure out whether a Lucene index keeps track of the
> time it was created, but I haven't found anything yet.  If it doesn't, I do
> wonder whether there might be some kind of metadata that Solr could write
> to the index to record information like this.  Solr would always have the
> option of writing such metadata to an entirely different location within
> the instanceDir.  The index creation time is probably not the only
> information that would be useful to have available.
>
> Thanks,
> Shawn
>
>


Determine Solr Core Creation Timestamp

2018-05-06 Thread Atita Arora
Hi,

I am working on developing a utility which lets one monitor the
index pipeline status.
The indexing job runs in two forms, where it either -
1. Creates a new core, OR
2. Runs a delta on an existing core.
To put it in the simplest form, I look at the DB timestamp of when the indexing
job was triggered, and I want to read some stat / metric from Solr
(preferably via an API) which reports a timestamp of when the core was created /
modified.
My utility relies entirely on the difference between the timestamps from the DB &
Solr, as these two timestamps are leveraged to determine the health of the pipeline.

I see the Master Version Timestamp under each shard which details the
version / Gen / Size.
Is that what I should be using ? How can I grab these from API ?
I tried using metrics api :
*http://localhost:8983/solr/admin/metrics?group=core&prefix=CORE*
which details *CORE.startTime *but this timestamp changes whenever data is
being added to any core on this node.
*Is there any other suggestion to use some other way to determine the core
creation timestamp* ?

Please help !

Thanks,
Atita


Re: Upgrading a Plugin from 6.6 to 7.x

2018-03-21 Thread Atita Arora
Hi Peter,


*(Sorry for the earlier incomplete email - I hit send by mistake)*

I haven't really been able to look into it completely, but my first glance
says it should be because the method signature has changed.

I am looking here :
https://lucene.apache.org/core/7_0_0/core/org/apache/lucene/search/Query.html

createWeight(IndexSearcher searcher, boolean needsScores, float boost)
Expert: Constructs an appropriate Weight implementation for this query.

While at :

https://lucene.apache.org/core/6_6_0/core/org/apache/lucene/search/Query.html


createWeight(IndexSearcher searcher, boolean needsScores)
Expert: Constructs an appropriate Weight implementation for this query.

You would need a code change for this to make it work in Version 7.
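
Roughly, the plugin's custom Query subclass would need to override the new
three-argument form; a minimal sketch, not the plugin's actual code (the
Weight class name below is just a stand-in):

@Override
public Weight createWeight(IndexSearcher searcher, boolean needsScores,
                           float boost) throws IOException {
  // Lucene 7 folds the boost into createWeight itself; build the same
  // Weight the old two-argument override used to return
  return new VectorScoreWeight(this, searcher, needsScores, boost);
}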

Thanks,
Atita


On Wed, Mar 21, 2018 at 6:59 PM, Atita Arora <atitaar...@gmail.com> wrote:

> Hi Peter,
>
> I haven't really been able to look into it completely , but my first
> glance says , it should be because the method signature has changed.
>
> I am looking here :
> https://lucene.apache.org/core/7_0_0/core/org/apache/lucene/search/Query.html
>
> createWeight(IndexSearcher searcher, boolean needsScores, float boost)
> Expert: Constructs an appropriate Weight implementation for this query.
>
> While at :
>
>
> On Wed, Mar 21, 2018 at 4:16 PM, Peter Alexander Kopciak <pe...@kopciak.at
> > wrote:
>
>> Hi!
>>
>> I'm still pretty new to Solr and I want to use the vector Scoring plugin (
>> https://github.com/saaay71/solr-vector-scoring/network) but
>> unfortunately,
>> it does not seem to work for newer Solr versions.
>>
>> I tested it with 6.6 to verify its functionality, so it seems to be broken
>> because of the upgrade to 7.x.
>>
>> When following the installation procedure and executing the examples, I
>> ran
>> into the following error with Query 1:
>>
>> java.lang.UnsupportedOperationException: Query {! type=vp f=vector
>> vector=0.1,4.75,0.3,1.2,0.7,4.0 v=} does not implement createWeight
>>
>> Does anyone has a lead for me how to fix/upgrade the plugin? The
>> createWeight method seems to exist, so I'm not sure where to start and
>> waht
>> the problem seems to be.
>>
>
>


Re: Upgrading a Plugin from 6.6 to 7.x

2018-03-21 Thread Atita Arora
Hi Peter,

I haven't really been able to look into it completely, but my first glance
says it should be because the method signature has changed.

I am looking here :
https://lucene.apache.org/core/7_0_0/core/org/apache/lucene/search/Query.html

createWeight(IndexSearcher searcher, boolean needsScores, float boost)
Expert: Constructs an appropriate Weight implementation for this query.

While at :


On Wed, Mar 21, 2018 at 4:16 PM, Peter Alexander Kopciak 
wrote:

> Hi!
>
> I'm still pretty new to Solr and I want to use the vector Scoring plugin (
> https://github.com/saaay71/solr-vector-scoring/network) but unfortunately,
> it does not seem to work for newer Solr versions.
>
> I tested it with 6.6 to verify its functionality, so it seems to be broken
> because of the upgrade to 7.x.
>
> When following the installation procedure and executing the examples, I ran
> into the following error with Query 1:
>
> java.lang.UnsupportedOperationException: Query {! type=vp f=vector
> vector=0.1,4.75,0.3,1.2,0.7,4.0 v=} does not implement createWeight
>
> Does anyone has a lead for me how to fix/upgrade the plugin? The
> createWeight method seems to exist, so I'm not sure where to start and waht
> the problem seems to be.
>


Re: How to store files larger than zNode limit

2018-03-14 Thread Atita Arora
Thank you Markus, that's kind of a relief to know!

Rick,
I spent a few minutes reading about Puppet/Ansible, as I have not used them
before, but this seems doable.
Let me give this a try and I'll let you know.
Thanks,
Atita

On Wed, Mar 14, 2018 at 5:01 PM, Rick Leir  wrote:

> Could you manage userdict using Puppet or Ansible? Or whatever your
> automation system is.
> --
> Sorry for being brief. Alternate email is rickleir at yahoo dot com
>


How to store files larger than zNode limit

2018-03-13 Thread Atita Arora
Hi ,

I have a use case supporting multiple clients and multiple languages in a
single application.
So, in order to improve the language support, we want to use Solr
dictionary (userdict.txt) files as large as 10MB.
I understand that ZooKeeper's default znode file size limit is 1MB.
I'm not sure if anyone has tried increasing it before and how that fares in
terms of performance.
Looking at - https://zookeeper.apache.org/doc/r3.2.2/zookeeperAdmin.html
It states -
Unsafe Options

The following options can be useful, but be careful when you use them. The
risk of each is explained along with the explanation of what the variable
does.
jute.maxbuffer:

(Java system property:* jute.maxbuffer*)

This option can only be set as a Java system property. There is no
zookeeper prefix on it. It specifies the maximum size of the data that can
be stored in a znode. The default is 0xfffff, or just under 1M. If this
option is changed, the system property must be set on all servers and
clients otherwise problems will arise. This is really a sanity check.
ZooKeeper is designed to store data on the order of kilobytes in size.
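
Concretely, for a 10MB dictionary that would presumably mean passing
something like the following on every ZooKeeper server and every Solr node
(untested, just going by the docs above):

# zookeeper-env.sh / zkEnv.sh on each ZooKeeper server
SERVER_JVMFLAGS="-Djute.maxbuffer=10485760"

# solr.in.sh on each Solr node
SOLR_OPTS="$SOLR_OPTS -Djute.maxbuffer=10485760"
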
I would appreciate it if someone has any suggestions on the best practices
for handling large config/dictionary files in ZK.

Thanks ,
Atita


Re: CLUSTERSTATUS API and Error loading specified collection / config in Solr 5.3.2.

2018-03-13 Thread Atita Arora
Hi Hendrik and Shalin,

Really appreciate your valuable inputs on this.

I looked at the two issues that were being referred to (SOLR-8804 and
SOLR-10720), and that's exactly what I'm running into.
Glad they have been fixed in later versions.

Thanks much ,
Atita

On Tue, Mar 13, 2018 at 10:38 AM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> I think you are running into race conditions in the API which have been
> fixed. See SOLR-8804 and SOLR-10720. The first is available in 5.5.1 but
> the latter fix will be released in the upcoming 7.3 release. The best
> workaround for your version is to just retry a few times until the API
> succeeds.
>
> On Sun, Mar 11, 2018 at 11:57 PM, Atita Arora <atitaar...@gmail.com>
> wrote:
>
> > Hi ,
> >
> > I am working on an application which involves working on a highly
> > distributed Solr cloud environment. The application supports
> multi-tenancy
> > and we have around 250-300 collections on Solr where each client has
> their
> > own collection with a new shard being created as clientid-<timestamp>,
> > where the timestamp is whenever the new data comes in for the client
> > (typically every 4-8 hrs) , the reason for this convention is to make
> sure
> > when the Indexes are being built (on demand) the timestamp matches
> closely
> > to the time when the last indexing was run (the earlier shard is
> > de-provisioned as soon as the new one is created). Whenever the indexing
> is
> > triggered it first makes a DB entry and then creates a catalog with
> > timestamp in solr.
> > The Solr cloud has 10 Nodes distributed geographically among 10
> > datacenters.
> > The replication factor is 2. The Solr version is 5.3.2.
> > Coming to my problem - I had to write a utility to ensure that the DB
> > insert timestamp matches closely to the Solr index timestamp wherein I
> can
> > ensure that if the difference between DB timestamp and Solr Index
> timestamp
> > is <= 2 hrs , we have fresh index. The new index contains revised prices
> of
> > products or offers etc which are critical to be updated as in when they
> > come. Hence this utility is to track that the required updates have been
> > successfully made.
> > I used *CLUSTERSTATUS* api for this task. It is serving the purpose well
> so
> > far , but pretty recently our solr cloud started complaining of strange
> > things because of which the *CLUSTERSTATUS* api keeps returning as error.
> >
> > The error claims to be of missing config & sometime missing collections
> > like.
> >
> > org.apache.solr.common.SolrException: Could not find collection :
> > > 1785-1520548816454
> >
> > org.apache.solr.common.SolrException: Could not find collection :
> > 1785-1520548816454
> > at
> > org.apache.solr.common.cloud.ClusterState.getCollection(
> > ClusterState.java:165)
> > at
> > org.apache.solr.handler.admin.ClusterStatus.getClusterStatus(
> > ClusterStatus.java:110)
> > at
> > org.apache.solr.handler.admin.CollectionsHandler$
> > CollectionOperation$19.call(CollectionsHandler.java:614)
> > at
> > org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(
> > CollectionsHandler.java:166)
> > at
> > org.apache.solr.handler.RequestHandlerBase.handleRequest(
> > RequestHandlerBase.java:143)
> > at
> > org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(
> > HttpSolrCall.java:678)
> > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:444)
> > at
> > org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> > SolrDispatchFilter.java:215)
> > at
> > org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> > SolrDispatchFilter.java:179)
> > at
> > org.eclipse.jetty.servlet.ServletHandler$CachedChain.
> > doFilter(ServletHandler.java:1652)
> > at
> > org.eclipse.jetty.servlet.ServletHandler.doHandle(
> ServletHandler.java:585)
> >
> > The other times it would complain of missing the config for same or
> > different client id- timestamp like :
> >
> > 1532-1518669619526_shard1_replica3:
> > org.apache.solr.common.cloud.ZooKeeperException:org.apache.
> > solr.common.cloud.ZooKeeperException:
> > Specified config does not exist in ZooKeeper:1532-1518669619526
> >
> > I would really appreciate if :
> >
> >
> >    1. Can someone guide me as to what's going on in the Solr cloud?
> >    2. Is CLUSTERSTATUS the right pick to build such a utility on? Do we
> > have any other option?
> >
> >
> > Thanks for any pointers and suggestions.
> >
> > Appreciate your attention looking this through.
> >
> > Atita
> >
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>


CLUSTERSTATUS API and Error loading specified collection / config in Solr 5.3.2.

2018-03-11 Thread Atita Arora
Hi ,

I am working on an application which involves working on a highly
distributed Solr cloud environment. The application supports multi-tenancy
and we have around 250-300 collections on Solr where each client has their
own collection with a new shard being created as clientid-<timestamp>,
where the timestamp is whenever the new data comes in for the client
(typically every 4-8 hrs) , the reason for this convention is to make sure
when the Indexes are being built (on demand) the timestamp matches closely
to the time when the last indexing was run (the earlier shard is
de-provisioned as soon as the new one is created). Whenever the indexing is
triggered it first makes a DB entry and then creates a catalog with
timestamp in solr.
The Solr cloud has 10 Nodes distributed geographically among 10 datacenters.
The replication factor is 2. The Solr version is 5.3.2.
Coming to my problem - I had to write a utility to ensure that the DB
insert timestamp matches closely to the Solr index timestamp, wherein I can
ensure that if the difference between the DB timestamp and the Solr index
timestamp is <= 2 hrs, we have a fresh index. The new index contains revised
prices of products, offers etc. which are critical to be updated as and when
they come. Hence this utility is to track that the required updates have been
successfully made.
I used *CLUSTERSTATUS* api for this task. It is serving the purpose well so
far , but pretty recently our solr cloud started complaining of strange
things because of which the *CLUSTERSTATUS* api keeps returning as error.
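
(For context, the call involved is just the stock Collections API request,
along the lines of the following; the host is a placeholder and the
collection name is one of ours:

http://<solr-host>:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=1785-1520548816454&wt=json )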

The error claims to be of missing config & sometime missing collections
like.

org.apache.solr.common.SolrException: Could not find collection :
> 1785-1520548816454

org.apache.solr.common.SolrException: Could not find collection :
1785-1520548816454
at
org.apache.solr.common.cloud.ClusterState.getCollection(ClusterState.java:165)
at
org.apache.solr.handler.admin.ClusterStatus.getClusterStatus(ClusterStatus.java:110)
at
org.apache.solr.handler.admin.CollectionsHandler$CollectionOperation$19.call(CollectionsHandler.java:614)
at
org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:166)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at
org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:678)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:444)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:215)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)

The other times it would complain of missing the config for same or
different client id- timestamp like :

1532-1518669619526_shard1_replica3:
org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException:
Specified config does not exist in ZooKeeper:1532-1518669619526

I would really appreciate if :


   1. Can someone guide me as to what's going on in the Solr cloud?
   2. Is CLUSTERSTATUS the right pick to build such a utility on? Do we
have any other option?


Thanks for any pointers and suggestions.

Appreciate your attention looking this through.

Atita


Re: Solr Basic Authentication setup issue (password SolrRocks not accepted) on Solr6.1.0/Zkp3.4.6

2018-02-23 Thread Atita Arora
Hi,

I tried the same on version 7.0.1 and it works with the same json.
However, I remember setting this up for another client who used the same
version, and they reported similar issues.
They later planned an upgrade to resolve this.

I would also advise you to look into SOLR-9188 & SOLR-9640.
Internode communication with BasicAuth is, as far as I believe, a buggy
feature in Solr 6.1 which eventually got fixed in later versions.
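
(Once the login works, the default solr/SolrRocks credentials are normally
rotated right away through the authentication API, along these lines:

curl --user solr:SolrRocks -H 'Content-type:application/json' \
  -d '{"set-user": {"solr":"NewStrongPassword"}}' \
  http://localhost:8983/solr/admin/authentication )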

Thanks,
Atita


On Fri, Feb 23, 2018 at 1:25 PM, Tarjono, C. A. 
wrote:

> Dear All,
>
>
>
> We are trying to implement basic authentication in our solrcloud
> implementation. We followed the PDF (for version 6.1.0) as below:
>
>1. Start Solr
>2. Created security.json
>
> {
>
> "authentication":{
>
> "blockUnknown": true,
>
> "class":"solr.BasicAuthPlugin",
>
> "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+
> z1oBbnQdiVC3otuq0=Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}
>
> },
>
> "authorization":{
>
> "class":"solr.RuleBasedAuthorizationPlugin",
>
> "permissions":[{"name":"security-edit",
> "role":"admin"}],
>
> "user-role":{"solr":"admin"}
>
> }
>
> }
>
>1. Uploaded the new security.json with below command
>
> # ./zkcli.sh -zkhost localhost:2181 -cmd putfile /security.json
> /u02/solr/setup/security.json
>
>1. Open up the solr admin page and prompted with authentication
>2. We try inputting username “solr” and password “SolrRocks” but it
>will not authenticate.
>
>
>
>
>
> From what I understand, that username/password combination is the default
> that will have to be changed later. Any ideas why it is not working?
>
> We tried to check for special characters in the encrypted password, there
> was none. For now we are removing the flag “blockUnknown” as a workaround.
>
>
>
> We are using SolrCloud 6.1.0 and Zookeeper 3.4.6 (ensamble) in our setup.
> Appreciate the input.
>
>
>
>
>
> Best Regards,
>
>
>
> Christopher Tarjono
>
> *Accenture Pte Ltd*
>
>
>
> +65 9347 2484
>
> c.a.tarj...@accenture.com
>
>
>
> --
>
> This message is for the designated recipient only and may contain
> privileged, proprietary, or otherwise confidential information. If you have
> received it in error, please notify the sender immediately and delete the
> original. Any other use of the e-mail by you is prohibited. Where allowed
> by local law, electronic communications with Accenture and its affiliates,
> including e-mail and instant messaging (including content), may be scanned
> by our systems for the purposes of information security and assessment of
> internal compliance with Accenture policy.
> 
> __
>
> www.accenture.com
>


Re: Request node status independently

2018-02-01 Thread Atita Arora
Hi Erick,

Just as you mentioned CLUSTERSTATUS, I am using the same for an almost
identical use case. The only issue I run into is that I need some way to use
a prefix with the collection param; is there some way to do that? That way I
could query just the specific collections of interest.
Note : My collection names are prefixed with the clientId.
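
(For reference, the stock call only accepts an exact name, e.g.
/admin/collections?action=CLUSTERSTATUS&collection=client42-1517482800000
with an illustrative collection name, so a prefix match would seem to need
client-side filtering of the full CLUSTERSTATUS response.)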

Thanks,
Atita

On Feb 1, 2018 10:06 PM, "Erick Erickson"  wrote:

> The Collections API CLUSTERSTATUS essentially gives you back the ZK
> state.json for individual collections (or your cluster, see the
> params). One note: Just because the state.json reports a replica as
> "active" isn't definitive. If the node died unexpectedly its replicas
> can't set the state when shutting down. So you also have to check
> whether the replica's node is in the "live_nodes" znode.
>
> Best,
> Erick
>
> On Thu, Feb 1, 2018 at 4:34 AM, Daniel Carrasco 
> wrote:
> > Hello,
> >
> > I'm trying to create a load balancer using HAProxy to detect nodes that
> are
> > down or recovering, but I'm not able to find the way to detect if the
> node
> > is healthy (the only commands i've seen check the entire cluster).
> > Is there any way to check the node status using http responses and get
> only
> > if is healthy or recovering?. Of course if is dead I've got no response,
> so
> > that's easy.
> >
> > Thanks and greetings!!
> >
> > --
> > _
> >
> >   Daniel Carrasco Marín
> >   Ingeniería para la Innovación i2TIC, S.L.
> >   Tlf:  +34 911 12 32 84 Ext: 223
> >   www.i2tic.com
> > _
>


Re: Custom Solr function

2018-01-30 Thread Atita Arora
Hope this helps -
https://dzone.com/articles/how-write-custom-solr
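
The gist of it, as a minimal sketch (the class, function and package names
are illustrative, and the toy logic just scores by the length of the
user-entered argument):

import org.apache.lucene.queries.function.ValueSource;
import org.apache.lucene.queries.function.valuesource.ConstValueSource;
import org.apache.solr.search.FunctionQParser;
import org.apache.solr.search.SyntaxError;
import org.apache.solr.search.ValueSourceParser;

public class MyFuncParser extends ValueSourceParser {
  @Override
  public ValueSource parse(FunctionQParser fp) throws SyntaxError {
    String userText = fp.parseArg();  // the [USER-ENTERED TEXT] argument
    // real scoring logic would go here; this toy returns a constant per query
    return new ConstValueSource(userText == null ? 0f : userText.length());
  }
}

Register it in solrconfig.xml with
<valueSourceParser name="myfunc" class="com.example.MyFuncParser"/>
and then sort=myfunc('some text') desc should be accepted.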

On Tue, Jan 30, 2018 at 2:06 PM, LOPEZ-CORTES Mariano-ext <
mariano.lopez-cortes-...@pole-emploi.fr> wrote:

> Can we create a custom function in Java?
>
> Example :
>
> sort = func([USER-ENTERED TEXT]) desc
>
> func will return a numeric value
>
> Thanks in advance
>


Re: Using lucene to post-process Solr query results

2018-01-23 Thread Atita Arora
Hi Rahul,
Looks like Streaming expressions can probably help you.
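
A rough shape of what I mean (the collection and fields are illustrative):

search(myCollection, q="*:*", fl="id,title", sort="id asc", qt="/export")

which streams back the matching docs so they can be post-processed, filtered
or joined with stream decorators instead of being re-indexed on the client.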

Is there something else you have tried for this?

Atita



On Jan 23, 2018 3:24 PM, "Rahul Chhiber" 
wrote:

Hi All,

For our business requirement, once our Solr client (Java) gets the results
of a search query from the Solr server, we need to further search across
and also within the content of the returned documents. To accomplish this,
I am attempting to create on the client-side an in-memory lucene index
(RAMDirectory), convert the SolrDocument objects into smaller lucene
Document objects, add them into the index and then search within it.

Has something like this been attempted yet? And does it sound like a
workable idea ?

P.S. - Reason for this approach is basically that we need search on the
data at a certain fine granularity but don't want to index the data at such
high granularity for indexing performance reasons i.e. we need to keep the
total number of documents small.

Appreciate any help.

Regards,
Rahul Chhiber


Re: Solr Exception: Undefined Field

2018-01-18 Thread Atita Arora
Hi Deepak,

I am wondering if we can take a look at your schema; if you can also tell us
which version of Solr you are on, that would help us debug and recreate the
issue on our local boxes.

Atita


On Jan 18, 2018 1:44 PM, "Deepak Goel"  wrote:

> Hello
>
> In Solr Admin: I type the q parameter as -
>
> text_entry:*
>
> It gives the following exception (In the schema I do see a field as
> text_entry):
>
> {
> "responseHeader":{
> "zkConnected":true,
> "status":400,
> "QTime":2,
> "params":{
> "q":"text_entry:*",
> "_":"1516190134181"}},
> "error":{
> "metadata":[
> "error-class","org.apache.solr.common.SolrException",
> "root-error-class","org.apache.solr.common.SolrException"],
> "msg":"undefined field text_entry",
> "code":400}}
>
>
> However when i type the q paramter as -
>
> {!term f=text_entry}henry
>
> This does give out the output as foll:
>
> {
> "responseHeader":{
> "zkConnected":true,
> "status":0,
> "QTime":0,
> "params":{
> "q":"{!term f=text_entry}henry",
> "_":"1516190134181"}},
> "response":{"numFound":262,"start":0,"docs":[
> {
> "type":"line",
> "line_id":"80075",
> "play_name":"Richard II",
> "speech_number":"13",
> "line_number":"3.3.37",
> "speaker":"HENRY BOLINGBROKE",
> "text_entry":"Henry Bolingbroke",
> "id":"9428c765-a4e8-4116-937a-9b70e8a8e2de",
> "_version_":1588569205789163522,
> "speaker_str":["HENRY BOLINGBROKE"],
> "text_entry_str":["Henry Bolingbroke"],
> "line_number_str":["3.3.37"],
> "type_str":["line"],
> "play_name_str":["Richard II"]},
> {
> 
>
> Any ideas what is going wrong in the first q?
>
> Thank You
>
> On 1/18/18, Rick Leir  wrote:
> > Deepak
> > Would you like to write your post again without asterisks? Include the
> > asterisks which are necessary to the query of course.
> > Rick
> >
> > On January 17, 2018 1:10:28 PM EST, Deepak Goel 
> wrote:
> >>*Hello*
> >>
> >>*In Solr Admin: I type the q parameter as - *
> >>
> >>*text_entry:**
> >>
> >>*It gives the following exception (In the schema I do see a field as
> >>text_entry):*
> >>
> >>{ "responseHeader":{ "zkConnected":true, "status":400, "QTime":2,
> >>"params":{
> >>"q":"text_entry:*", "_":"1516190134181"}}, "error":{ "metadata":[
> >>"error-class","org.apache.solr.common.SolrException",
> >>"root-error-class",
> >>"org.apache.solr.common.SolrException"], "msg":"undefined field
> >>text_entry",
> >>"code":400}}
> >>
> >>
> >>*However when i type the q paramter as -*
> >>
> >>*{!term f=text_entry}henry*
> >>
> >>*This does give out the output as foll:*
> >>
> >>{ "responseHeader":{ "zkConnected":true, "status":0, "QTime":0,
> >>"params":{ "
> >>q":"{!term f=text_entry}henry", "_":"1516190134181"}},
> >>"response":{"numFound
> >>":262,"start":0,"docs":[ { "type":"line", "line_id":"80075",
> >>"play_name":"Richard
> >>II", "speech_number":"13", "line_number":"3.3.37", "speaker":"HENRY
> >>BOLINGBROKE", "text_entry":"Henry Bolingbroke", "id":
> >>"9428c765-a4e8-4116-937a-9b70e8a8e2de",
> >>"_version_":1588569205789163522, "
> >>speaker_str":["HENRY BOLINGBROKE"], "text_entry_str":["Henry
> >>Bolingbroke"],
> >>"line_number_str":["3.3.37"], "type_str":["line"],
> >>"play_name_str":["Richard
> >>II"]}, {
> >>**
> >>
> >>Any ideas what is going wrong in the first q?
> >>
> >>Thank You
> >>
> >>Deepak
> >>"Please stop cruelty to Animals, help by becoming a Vegan"
> >>+91 73500 12833
> >>deic...@gmail.com
> >>
> >>Facebook: https://www.facebook.com/deicool
> >>LinkedIn: www.linkedin.com/in/deicool
> >>
> >>"Plant a Tree, Go Green"
> >>
> >
> > --
> > Sorry for being brief. Alternate email is rickleir at yahoo dot com
> > --
> > Sorry for being brief. Alternate email is rickleir at yahoo dot com
>
>
> --
>
>
> Deepak
> "Please stop cruelty to Animals, help by becoming a Vegan"
> +91 73500 12833
> deic...@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>


Re: How to implement the function of W/N in Solr?

2018-01-15 Thread Atita Arora
Did you give Proximity Search a try ?
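
Something along these lines, i.e. a sloppy phrase query (not an exact W/N
equivalent, since ordering is not strictly enforced):

q="A B C D"~3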


On Mon, Jan 15, 2018 at 1:34 PM, xizhen.w...@incoshare.com <
xizhen.w...@incoshare.com> wrote:

> Hello,
>
> I'm using Solr 4.10.3, and I want "A" and "B" are together, "C" and "D"
> are together, and the terms "B" and "C" are no more than 3 terms away from
> each other, by using {!surround} 3w("A B", "C D"), but it doesn't work.  Is
> there any other useful way?
>
> Any help is appreciated.
>
>
>
> xizhen.w...@incoshare.com
>


Re: Solr Wildcard Search

2017-11-30 Thread Atita Arora
As Rick raised, the most important aspect here is that the phrase is broken
into multiple terms ORed together.
I believe that if the use case requires performing wildcard searches on
phrases, we would need to store the entire phrase as a single term in the
index, which is probably not happening right now, and hence the phrases are
not found when sent across as phrases.
I tried this on my local Solr 7.1: without a phrase this works as expected;
however, as soon as I do a phrase search it fails for the reason I mentioned
above.
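
A minimal sketch of what I mean (untested, the field type name is
illustrative); the KeywordTokenizer keeps the whole value as a single term,
so a wildcard can then span the original whitespace:

<fieldType name="text_exact" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

copyField the source into a field of this type and run the phrase/wildcard
query against that field instead.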

Let me know if I can clarify further.

On Thu, Nov 30, 2017 at 6:31 PM, Georgy Nevsky 
wrote:

> I wish to understand if I can do something to get in result term "shipping"
> when search for "shipp*"?
>
> Here field definition:
>  multiValued="false"/>
>
>  positionIncrementGap="100">
>   
> 
>  ignoreCase="true"
> words="lang/stopwords_en.txt"
> />
> 
> 
>  protected="protwords.txt"/>
> 
>   
>
> Anything else can be important? Most configuration parameters are default
> to
> Apache Solr 7.1.0.
>
> In the best we trust
> Georgy Nevsky
>
>
> -Original Message-
> From: Rick Leir [mailto:rl...@leirtech.com]
> Sent: Thursday, November 30, 2017 7:32 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Wildcard Search
>
> George,
> When you get those results it could be due to stemming.
>
> Wildcard processing expands your term to multiple terms, OR'd together. It
> also takes you down a different analysis pathway, as many analysis
> components do not work with multiple terms. Look into the SolrAdmin
> console,
> and use the analysis tab to understand what is going on.
>
> If you still have doubts, tell us more about your config.
> Cheers --Rick
>
>
> On November 30, 2017 7:06:42 AM EST, Georgy Nevsky
>  wrote:
> >Can somebody help me understand how Solr Wildcard Search is working?
> >
> >If I’m doing search for “ship*” term I’m getting in result many
> >strings, like “Shipping Weight”, “Ship From”, “Shipping Calculator”,
> >etc.
> >
> >But if I’m searching for “shipp*” I don’t get any result.
> >
> >
> >
> >In the best we trust
> >
> >Georgy Nevsky
>
> --
> Sorry for being brief. Alternate email is rickleir at yahoo dot com
>


Re: Issue facing with spell text field containing hyphen

2017-11-21 Thread Atita Arora
I was about to suggest the same; the Analysis panel is the savior in such
cases of doubt.

-Atita

On Tue, Nov 21, 2017 at 7:26 AM, Rick Leir  wrote:

> Chirag
> Look in Sor Admin, the Analysis panel. Put spider-man in the left and
> right text inputs, and see how it gets analysed. Cheers -- Rick
>
> On November 20, 2017 10:00:49 PM EST, Chirag garg 
> wrote:
> >Hi Rick,
> >
> >Actually my spell field also contains text with hyphen i.e. it contains
> >"spider-man" even then also i am not able to search it.
> >
> >Regards,
> >Chirag
> >
> >
> >
> >--
> >Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>
> --
> Sorry for being brief. Alternate email is rickleir at yahoo dot com
>


Re: Anyone have any comments on current solr monitoring favorites?

2017-11-06 Thread Atita Arora
Hi @Daniel ,

What version of Solr are you using ?
We gave Prometheus + Jolokia + InfluxDB + Grafana a try, and that came out
well.
With Solr 6.6 the metrics are exposed through the /metrics API, but how do
we go about it for the earlier versions, please guide?
Specifically the cache monitoring.

Thanks in advance,
Atita

On Mon, Nov 6, 2017 at 2:19 PM, Daniel Ortega 
wrote:

> Hi Robert,
>
> We use the following stack:
>
> - Prometheus to scrape metrics (https://prometheus.io/)
> - Prometheus node exporter to export "machine metrics" (Disk, network
> usage, etc.) (https://github.com/prometheus/node_exporter)
> - Prometheus JMX exporter to export "Solr metrics" (Cache usage, QPS,
> Response times...) (https://github.com/prometheus/jmx_exporter)
> - Grafana to visualize all the data scrapped by Prometheus (
> https://grafana.com/)
>
> Best regards
> Daniel Ortega
>
> 2017-11-06 20:13 GMT+01:00 Petersen, Robert (Contr) <
> robert.peters...@ftr.com>:
>
> > PS I knew sematext would be required to chime in here!  
> >
> >
> > Is there a non-expiring dev version I could experiment with? I think I
> did
> > sign up for a trial years ago from a different company... I was actually
> > wondering about hooking it up to my personal AWS based solr cloud
> instance.
> >
> >
> > Thanks
> >
> > Robi
> >
> > 
> > From: Emir Arnautović 
> > Sent: Thursday, November 2, 2017 2:05:10 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Anyone have any comments on current solr monitoring
> favorites?
> >
> > Hi Robi,
> > Did you try Sematext’s SPM? It provides host, JVM and Solr metrics and
> > more. We use it for monitoring our Solr instances and for consulting.
> >
> > Disclaimer - see signature :)
> >
> > Emir
> > --
> > Monitoring - Log Management - Alerting - Anomaly Detection
> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >
> >
> >
> > > On 2 Nov 2017, at 19:35, Walter Underwood 
> wrote:
> > >
> > > We use New Relic for JVM, CPU, and disk monitoring.
> > >
> > > I tried the built-in metrics support in 6.4, but it just didn’t do what
> > we want. We want rates and percentiles for each request handler. That
> gives
> > us 95th percentile for textbooks suggest or for homework search results
> > page, etc. The Solr metrics didn’t do that. The Jetty metrics didn’t do
> > that.
> > >
> > > We built a dedicated servlet filter that goes in front of the Solr
> > webapp and reports metrics. It has some special hacks to handle some
> weird
> > behavior in SolrJ. A request to the “/srp” handler is sent as
> > “/select?qt=/srp”, so we normalize that.
> > >
> > > The metrics start with the cluster name, the hostname, and the
> > collection. The rest is generated like this:
> > >
> > > URL: GET /solr/textbooks/select?q=foo&qt=/auto
> > > Metric: textbooks.GET./auto
> > >
> > > URL: GET /solr/textbooks/select?q=foo
> > > Metric: textbooks.GET./select
> > >
> > > URL: GET /solr/questions/auto
> > > Metric: questions.GET./auto
> > >
> > > So a full metric for the cluster “solr-cloud” and the host “search01"
> > would look like “solr-cloud.search01.solr.textbooks.GET./auto.m1_rate”.
> > >
> > > We send all that to InfluxDB. We’ve configured a template so that each
> > part of the metric name is mapped to a field, so we can write efficient
> > queries in InfluxQL.
> > >
> > > Metrics are graphed in Grafana. We have dashboards that mix Cloudwatch
> > (for the load balancer) and InfluxDB.
> > >
> > > I’m still working out the kinks in some of the more complicated
> queries,
> > but the data is all there. I also want to expand the servlet filter to
> > report HTTP response codes.
> > >
> > > wunder
> > > Walter Underwood
> > > wun...@wunderwood.org
> > > http://observer.wunderwood.org/  (my blog)
> > >
> > >
> > >> On Nov 2, 2017, at 9:30 AM, Petersen, Robert (Contr) <
> > robert.peters...@ftr.com> wrote:
> > >>
> > >> OK I'm probably going to open a can of worms here...  lol
> > >>
> > >>
> > >> In the old old days I used PSI probe to monitor solr running on tomcat
> > which worked ok on a machine by machine basis.
> > >>
> > >>
> > >> Later I had a grafana dashboard on top of graphite monitoring which
> was
> > really nice looking but kind of complicated to set up.
> > >>
> > >>
> > >> Even later I successfully just dropped in a newrelic java agent which
> > had solr monitors and a dashboard right out of the box, but it costs
> money
> > for the full tamale.
> > >>
> > >>
> > >> For basic JVM health and Solr QPS and time percentiles, does anyone
> > have any favorites or other alternative suggestions?
> > >>
> > >>
> > >> Thanks in advance!
> > >>
> > >> Robi
> > >>
> > >> 
> > >>
> > >> This communication is confidential. Frontier only sends and receives
> > email on the basis of the terms set out at
> http://www.frontier.com/email_
> > disclaimer.
> > >
> >
> >
>


Re: Need help detecting Relatedness in documents

2017-10-26 Thread Atita Arora
Thanks for the suggestion Anshum, I appreciate your response!

I tried using MLT with the field that stores the similarity index of topics
a document could be related to.
But this wasn't really accepted as the solution, as it could not resolve the
next stage of my problem, where I need the effective 'number of posts' in
which the topics that MLT deduced as related were actually found together.
I believe MLT leverages these numbers to order the returned set internally.
So the major challenge was to get those numbers too, as they are being used
on a graph where these numbers are plotted.
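
(The MLT attempt was roughly of this shape; the field name and document id
are illustrative:

q={!mlt qf=related_topics mintf=1 mindf=1}TOPIC_DOC_ID )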

I wonder if there's an alternative way to get it.
Appreciate any further input on this.

Thanks,
Atita

On Thu, Oct 26, 2017 at 11:36 PM, Anshum Gupta <ansh...@apple.com> wrote:

> I would suggest you look at the mlt query parser. That allows you to find
> documents similar to a particular documents, and also allows for specifying
> the field to use for similarity purposes.
>
> https://lucene.apache.org/solr/guide/7_0/other-parsers.
> html#more-like-this-query-parser
>
> -Anshum
>
>
>
> On Oct 26, 2017, at 1:16 AM, Atita Arora <atitaar...@gmail.com> wrote:
>
> Hi ,
>
> We're working with a product where the idea is to present the users the
> related documents in particular timeseries.
>
> For an overview think about this as an application which picks up top
> trending blogposts "topics" which are picked and ingested from various
> social sites.
> Further , when you look into the topic from the trending list it shows the
> related topics which happen to happen on the blogposts.
> So to mark a related topic they should have occured on a same blogpost , to
> add , more are these number of occurences , more would be the relatedness
> factor.
>
> Complexity is the related topics change on the user defined date spread ,
> which means if x & y were top most related topics in the blogposts made in
> last 30 days ,
> there is an equal possibility that x could be more related to z if the user
> would have wanted to see related topics in last 60 days.
> So the number of days are user defined and they impact the related topics.
>
> For now every blogpost goes in the index as a seperate document and the
> topic extraction happens alongside indexing which extracts the topics from
> the blogposts and stores them in a different collection.
> For this we have lot of duplicates on the index too , for e.g. a topicname
> search  "football" has around 80K documents , all of them are
> topicname="football".
>
> I wonder if someone can help me :
> 1. How to structure the document in such a way the queries could be more
> performant
> 2. Suggest me as to how can we detect the RELATED topics.
>
> Any help on this would be highly appreciated.
>
> Thanks in advance.
>
> Atita
>
>
>


Need help detecting Relatedness in documents

2017-10-26 Thread Atita Arora
Hi ,

We're working on a product where the idea is to present to the users the
related documents in a particular timeseries.

For an overview, think of this as an application which picks up top
trending blogpost "topics", picked up and ingested from various
social sites.
Further, when you look into a topic from the trending list, it shows the
related topics which happen to occur in the same blogposts.
So to mark topics as related, they should have occurred on the same
blogpost; in addition, the higher the number of such co-occurrences, the
higher the relatedness factor.

The complexity is that the related topics change with the user-defined date
spread, which means that if x & y were the top related topics in the
blogposts made in the last 30 days,
there is an equal possibility that x could be more related to z if the user
wanted to see related topics over the last 60 days.
So the number of days is user defined and it impacts the related topics.

For now every blogpost goes into the index as a separate document, and
topic extraction happens alongside indexing, which extracts the topics from
the blogposts and stores them in a different collection.
Because of this we have a lot of duplicates in the index too; e.g., a
topicname search for "football" returns around 80K documents, all of them
with topicname="football".

I wonder if someone can help me with:
1. How to structure the documents in such a way that the queries are more
performant
2. How we can detect the RELATED topics.

Any help on this would be highly appreciated.

Thanks in advance.

Atita


Re: Need help with Slow Query Logging

2017-10-12 Thread Atita Arora
Indeed, the trouble isn't over yet.
So we got
https://issues.apache.org/jira/browse/SOLR-11453
created in the meantime.

I'll look forward to your updates.

Thanks again ,
Atita
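
(For anyone finding this thread later: SOLR-11453 did land eventually, and
newer Solr releases, 7.2 onward if I recall correctly, write slow requests
through a dedicated org.apache.solr.core.SolrCore.SlowRequest logger into
solr_slow_requests.log out of the box; on 6.x the log4j appender approach
discussed in this thread is still the way to go.)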

On Thu, Oct 12, 2017 at 2:08 PM, Emir Arnautović <
emir.arnauto...@sematext.com> wrote:

> Hi Atita,
> I did not have time to try it out, but will try to do it over the weekend
> if you are still having troubles with it.
>
> Regards,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 10 Oct 2017, at 19:59, Atita Arora <atitaar...@gmail.com> wrote:
> >
> > No luck for me, did you give it a try in the meantime?
> > I'm not sure if I may have missed something, but my logs are completely
> > gone after this change.
> >
> > Wondering what's wrong with them.
> >
> > -Atita
> >
> > On Tue, Oct 10, 2017 at 5:58 PM, Atita Arora <atitaar...@gmail.com>
> wrote:
> >
> >> Sure thanks Emir,
> >> Let me give them a quick try and I'll update you.
> >>
> >> Thanks,
> >> Atita
> >>
> >> On Tue, Oct 10, 2017 at 5:28 PM, Emir Arnautović <
> >> emir.arnauto...@sematext.com> wrote:
> >>
> >>> Hi Atita,
> >>> I did not try it, but I think that following could work:
> >>>
> >>>
> >>> #logging queries
> >>> log4j.logger.org.apache.solr.handler.component.QueryComponent=WARN,slow
> >>>
> >>> log4j.appender.slow=org.apache.log4j.RollingFileAppender
> >>> log4j.appender.slow.File=${solr.log}/slow.log
> >>> log4j.appender.slow.layout=org.apache.log4j.EnhancedPatternLayout
> >>> log4j.appender.slow.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss.SSS} %-5p (%t) [%X{collection} %X{shard} %X{replica} %X{core}] %c{1.} %m%n
> >>>
> >>> If you want to log all queries, you can change level for query
> component
> >>> to INFO.
> >>>
> >>> HTH,
> >>> Emir
> >>> --
> >>> Monitoring - Log Management - Alerting - Anomaly Detection
> >>> Solr & Elasticsearch Consulting Support Training -
> http://sematext.com/
> >>>
> >>>
> >>>
> >>>> On 10 Oct 2017, at 13:35, Atita Arora <atitaar...@gmail.com> wrote:
> >>>>
> >>>> Hi Emir,
> >>>>
> >>>> So I made few changes to the log4j config , I am able to redirect
> these
> >>>> logs to another file as well.
> >>>> But as these are the WARN logs so I doubt any logs enabled at WARN
> level
> >>>> are going to be redirected here in this new log file.
> >>>> So precisely , I am using Solr 6.1 (in cloud mode) & I have made few
> >>> more
> >>>> changes to the logging levels and components.
> >>>> Please find my log4j at : *https://pastebin.com/uTLAiBE5
> >>>> <https://pastebin.com/uTLAiBE5>*
> >>>>
> >>>> Any help on this will surely be appreciated.
> >>>>
> >>>> Thanks again.
> >>>>
> >>>> Atita
> >>>>
> >>>>
> >>>> On Tue, Oct 10, 2017 at 1:39 PM, Emir Arnautović <
> >>>> emir.arnauto...@sematext.com> wrote:
> >>>>
> >>>>> Hi Atita,
> >>>>> You should definitely go with log4j configuration as anything else
> >>> would
> >>>>> be redoing what log4j can do. You already have
> >>> slowQueryThresholdMillies to
> >>>>> make slow queries log with WARN and you can configure log4j to put
> such
> >>>>> logs (class + level) to a separate file.
> >>>>> This seems like frequent question and not sure why putting logs to
> >>>>> separate file is not a default configuration - maybe it would make
> >>> things
> >>>>> bit more complicated with logs view in admin console…
> >>>>> If get stuck, let me know (+ Solr version) and I’ll play a bit and
> send
> >>>>> you configs.
> >>>>>
> >>>>> HTH,
> >>>>> Emir
> >>>>> --
> >>>>> Monitoring - Log Management - Alerting - Anomaly Detection
> >>>>> Solr & Elasticsearch Consulting Support Training -
> >>> http://sematext.com/
> >>>>>
> >>>>

Re: Need help with Slow Query Logging

2017-10-10 Thread Atita Arora
No luck for me, did you give it a try in the meantime?
I'm not sure if I may have missed something, but my logs are completely
gone after this change.

Wondering what's wrong with them.

-Atita

On Tue, Oct 10, 2017 at 5:58 PM, Atita Arora <atitaar...@gmail.com> wrote:

> Sure thanks Emir,
> Let me give them a quick try and I'll update you.
>
> Thanks,
> Atita
>
> On Tue, Oct 10, 2017 at 5:28 PM, Emir Arnautović <
> emir.arnauto...@sematext.com> wrote:
>
>> Hi Atita,
>> I did not try it, but I think that following could work:
>>
>>
>> #logging queries
>> log4j.logger.org.apache.solr.handler.component.QueryComponent=WARN,slow
>>
>> log4j.appender.slow=org.apache.log4j.RollingFileAppender
>> log4j.appender.slow.File=${solr.log}/slow.log
>> log4j.appender.slow.layout=org.apache.log4j.EnhancedPatternLayout
>> log4j.appender.slow.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss.SSS}
>> %-5p (%t) [%X{collection} %X{shard} %X{replica} %X{core}] %c{1.} %m%n
>>
>> If you want to log all queries, you can change level for query component
>> to INFO.
>>
>> HTH,
>> Emir
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>
>>
>>
>> > On 10 Oct 2017, at 13:35, Atita Arora <atitaar...@gmail.com> wrote:
>> >
>> > Hi Emir,
>> >
>> > So I made few changes to the log4j config , I am able to redirect these
>> > logs to another file as well.
>> > But as these are the WARN logs so I doubt any logs enabled at WARN level
>> > are going to be redirected here in this new log file.
>> > So precisely , I am using Solr 6.1 (in cloud mode) & I have made few
>> more
>> > changes to the logging levels and components.
>> > Please find my log4j at : *https://pastebin.com/uTLAiBE5
>> > <https://pastebin.com/uTLAiBE5>*
>> >
>> > Any help on this will surely be appreciated.
>> >
>> > Thanks again.
>> >
>> > Atita
>> >
>> >
>> > On Tue, Oct 10, 2017 at 1:39 PM, Emir Arnautović <
>> > emir.arnauto...@sematext.com> wrote:
>> >
>> >> Hi Atita,
>> >> You should definitely go with log4j configuration as anything else
>> would
>> >> be redoing what log4j can do. You already have
>> slowQueryThresholdMillies to
>> >> make slow queries log with WARN and you can configure log4j to put such
>> >> logs (class + level) to a separate file.
>> >> This seems like frequent question and not sure why putting logs to
>> >> separate file is not a default configuration - maybe it would make
>> things
>> >> bit more complicated with logs view in admin console…
>> >> If get stuck, let me know (+ Solr version) and I’ll play a bit and send
>> >> you configs.
>> >>
>> >> HTH,
>> >> Emir
>> >> --
>> >> Monitoring - Log Management - Alerting - Anomaly Detection
>> >> Solr & Elasticsearch Consulting Support Training -
>> http://sematext.com/
>> >>
>> >>
>> >>
>> >>> On 9 Oct 2017, at 16:27, Atita Arora <atitaar...@gmail.com> wrote:
>> >>>
>> >>> Hi ,
>> >>>
>> >>> I have a situation here where I am required to log the slow queries
>> into
>> >> a
>> >>> separate log file which then can be used for optimization purposes.
>> >>> For now this log is aggregated into the mainstream log marking
>> >>> [slow:..].
>> >>> I looked into the code and the configuration and I am really clueless
>> as
>> >> to
>> >>> how do I go about seperating the slow query logs as it needs another
>> file
>> >>> appender
>> >>> to be created other than the one already present in the log4j.
>> >>> If I create another appender I can do so by degregating through log
>> >> levels
>> >>> , so that moves all the WARN logs to another file (which is not what
>> I am
>> >>> looking for).
>> >>> Also from the code prespective , I feel how about if I introduce
>> another
>> >>> config setting along with the slowQueryThresholdMillis value ,
>> something
>> >>> like
>> >>>
>> >>> slowQueryLogFile = get("query/slowQueryLogFile", logfilepath);
>> >>>
>> >>>
>> >>> where slowQueryLogFile and if present it logs into this file
>> otherwise it
>> >>> works on the already present along with
>> >>>
>> >>> slowQueryThresholdMillis = getInt("query/slowQueryThresholdMillis",
>> -1);
>> >>>
>> >>>
>> >>> or should I tweak log4j ?
>> >>> I am not sure if anyone has done that before or have any pointers to
>> >> guide
>> >>> me on this.
>> >>> Please help.
>> >>>
>> >>> Thanks in advance,
>> >>> Atita
>> >>
>> >>
>>
>>
>


Re: Need help with Slow Query Logging

2017-10-10 Thread Atita Arora
Sure thanks Emir,
Let me give them a quick try and I'll update you.

Thanks,
Atita

On Tue, Oct 10, 2017 at 5:28 PM, Emir Arnautović <
emir.arnauto...@sematext.com> wrote:

> Hi Atita,
> I did not try it, but I think that following could work:
>
>
> #logging queries
> log4j.logger.org.apache.solr.handler.component.QueryComponent=WARN,slow
>
> log4j.appender.slow=org.apache.log4j.RollingFileAppender
> log4j.appender.slow.File=${solr.log}/slow.log
> log4j.appender.slow.layout=org.apache.log4j.EnhancedPatternLayout
> log4j.appender.slow.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss.SSS}
> %-5p (%t) [%X{collection} %X{shard} %X{replica} %X{core}] %c{1.} %m%n
>
> If you want to log all queries, you can change level for query component
> to INFO.
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 10 Oct 2017, at 13:35, Atita Arora <atitaar...@gmail.com> wrote:
> >
> > Hi Emir,
> >
> > So I made few changes to the log4j config , I am able to redirect these
> > logs to another file as well.
> > But as these are the WARN logs so I doubt any logs enabled at WARN level
> > are going to be redirected here in this new log file.
> > So precisely , I am using Solr 6.1 (in cloud mode) & I have made few more
> > changes to the logging levels and components.
> > Please find my log4j at : *https://pastebin.com/uTLAiBE5
> > <https://pastebin.com/uTLAiBE5>*
> >
> > Any help on this will surely be appreciated.
> >
> > Thanks again.
> >
> > Atita
> >
> >
> > On Tue, Oct 10, 2017 at 1:39 PM, Emir Arnautović <
> > emir.arnauto...@sematext.com> wrote:
> >
> >> Hi Atita,
> >> You should definitely go with log4j configuration as anything else would
> >> be redoing what log4j can do. You already have
> slowQueryThresholdMillies to
> >> make slow queries log with WARN and you can configure log4j to put such
> >> logs (class + level) to a separate file.
> >> This seems like frequent question and not sure why putting logs to
> >> separate file is not a default configuration - maybe it would make
> things
> >> bit more complicated with logs view in admin console…
> >> If get stuck, let me know (+ Solr version) and I’ll play a bit and send
> >> you configs.
> >>
> >> HTH,
> >> Emir
> >> --
> >> Monitoring - Log Management - Alerting - Anomaly Detection
> >> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >>
> >>
> >>
> >>> On 9 Oct 2017, at 16:27, Atita Arora <atitaar...@gmail.com> wrote:
> >>>
> >>> Hi ,
> >>>
> >>> I have a situation here where I am required to log the slow queries
> into
> >> a
> >>> separate log file which then can be used for optimization purposes.
> >>> For now this log is aggregated into the mainstream log marking
> >>> [slow:..].
> >>> I looked into the code and the configuration and I am really clueless
> as
> >> to
> >>> how do I go about seperating the slow query logs as it needs another
> file
> >>> appender
> >>> to be created other than the one already present in the log4j.
> >>> If I create another appender I can do so by segregating through log
> >> levels
> >>> , so that moves all the WARN logs to another file (which is not what I
> am
> >>> looking for).
> >>> Also from the code perspective, I feel how about if I introduce
> another
> >>> config setting along with the slowQueryThresholdMillis value ,
> something
> >>> like
> >>>
> >>> slowQueryLogFile = get("query/slowQueryLogFile", logfilepath);
> >>>
> >>>
> >>> where slowQueryLogFile and if present it logs into this file otherwise
> it
> >>> works on the already present along with
> >>>
> >>> slowQueryThresholdMillis = getInt("query/slowQueryThresholdMillis",
> -1);
> >>>
> >>>
> >>> or should I tweak log4j ?
> >>> I am not sure if anyone has done that before or have any pointers to
> >> guide
> >>> me on this.
> >>> Please help.
> >>>
> >>> Thanks in advance,
> >>> Atita
> >>
> >>
>
>


Re: Need help with Slow Query Logging

2017-10-10 Thread Atita Arora
Hi Emir,

So I made a few changes to the log4j config, and I am able to redirect
these logs to another file as well.
But as these are the WARN logs, I suspect any logs enabled at WARN level
are going to be redirected to this new log file.
To be precise, I am using Solr 6.1 (in cloud mode) & I have made a few more
changes to the logging levels and components.
Please find my log4j at : *https://pastebin.com/uTLAiBE5
<https://pastebin.com/uTLAiBE5>*

Any help on this will surely be appreciated.

Thanks again.

Atita


On Tue, Oct 10, 2017 at 1:39 PM, Emir Arnautović <
emir.arnauto...@sematext.com> wrote:

> Hi Atita,
> You should definitely go with log4j configuration as anything else would
> be redoing what log4j can do. You already have slowQueryThresholdMillies to
> make slow queries log with WARN and you can configure log4j to put such
> logs (class + level) to a separate file.
> This seems like frequent question and not sure why putting logs to
> separate file is not a default configuration - maybe it would make things
> bit more complicated with logs view in admin console…
> If get stuck, let me know (+ Solr version) and I’ll play a bit and send
> you configs.
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 9 Oct 2017, at 16:27, Atita Arora <atitaar...@gmail.com> wrote:
> >
> > Hi ,
> >
> > I have a situation here where I am required to log the slow queries into
> a
> > separate log file which then can be used for optimization purposes.
> > For now this log is aggregated into the mainstream log marking
> > [slow:..].
> > I looked into the code and the configuration and I am really clueless as
> to
> > how do I go about seperating the slow query logs as it needs another file
> > appender
> > to be created other than the one already present in the log4j.
> > If I create another appender I can do so by segregating through log
> levels
> > , so that moves all the WARN logs to another file (which is not what I am
> > looking for).
> > Also from the code perspective, I feel how about if I introduce another
> > config setting along with the slowQueryThresholdMillis value , something
> > like
> >
> > slowQueryLogFile = get("query/slowQueryLogFile", logfilepath);
> >
> >
> > where slowQueryLogFile and if present it logs into this file otherwise it
> > works on the already present along with
> >
> > slowQueryThresholdMillis = getInt("query/slowQueryThresholdMillis", -1);
> >
> >
> > or should I tweak log4j ?
> > I am not sure if anyone has done that before or have any pointers to
> guide
> > me on this.
> > Please help.
> >
> > Thanks in advance,
> > Atita
>
>


Re: Semantic Knowledge Graph

2017-10-09 Thread Atita Arora
Hi,

Is this the one you're looking for :

https://www.slideshare.net/treygrainger/leveraging-lucenesolr-as-a-knowledge-graph-and-intent-engine

-Atita

On Mon, Oct 9, 2017 at 7:44 PM, David Hastings  wrote:

> Hey All, slides from the 2017 lucene revolution were put up recently, but
> unfortunately, the one I have the most interest in, the semantic knowledge
> graph, has not been put up:
>
> https://lucenesolrrevolution2017.sched.com/event/BAwX/the-
> apache-solr-semantic-knowledge-graph?iframe=no=100%=yes=no
>
>
> Don't suppose anyone knows where I may be able to find them, or can point
> me in a direction to get more information about this tool.
>
> Thanks - dave
>


Need help with Slow Query Logging

2017-10-09 Thread Atita Arora
Hi ,

I have a situation here where I am required to log the slow queries into a
separate log file which can then be used for optimization purposes.
For now this log is aggregated into the mainstream log, marked
[slow:..].
I looked into the code and the configuration, and I am really clueless as
to how I go about separating the slow query logs, as it needs another file
appender
to be created besides the one already present in the log4j config.
If I create another appender I can do so by segregating through log levels,
but that moves all the WARN logs to another file (which is not what I am
looking for).
Also, from the code perspective, how about if I introduce another
config setting along with the slowQueryThresholdMillis value, something
like

slowQueryLogFile = get("query/slowQueryLogFile", logfilepath);


where, if slowQueryLogFile is present, it logs into this file; otherwise it
works on the one already present, along with

slowQueryThresholdMillis = getInt("query/slowQueryThresholdMillis", -1);


or should I tweak log4j ?
I am not sure if anyone has done that before or have any pointers to guide
me on this.
Please help.

Thanks in advance,
Atita


Re: Solr 6.5 remote JMX

2017-10-03 Thread Atita Arora
Hi Prashant,

Did you restart Solr with these options, like:

bin/solr start -c -h  -d node2/ -z  -m 3g -s node2/solr -a
"-Dcom.sun.management.jmxremote.port=5556
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.host=localhost -Dsun.rmi.transport.logLevel=4
-Djava.rmi.server.hostname=" -p 

Please check.

Thanks,
Atita
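
If you would rather keep this in solr.in.sh than on the command line, a
hedged sketch (the hostname value is a placeholder to replace):

ENABLE_REMOTE_JMX_OPTS="true"
RMI_PORT=18983
SOLR_OPTS="$SOLR_OPTS -Djava.rmi.server.hostname=your.solr.host"

With ENABLE_REMOTE_JMX_OPTS=true the start script wires up the
com.sun.management.jmxremote.* properties itself; java.rmi.server.hostname
is the piece most often missing when jconsole connects from another
machine.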

On Tue, Oct 3, 2017 at 10:51 PM, Satyaprashant Bezwada <
satyaprashant.bezw...@nasdaq.com> wrote:

> I am not able to connect to remote JMX using jconsole on Solr 6.5. The
> Solrconfig.xml does have this XML attribute <jmx/> that states the default
> MBean. I want to set up JMX so as to monitor the solr nodes. I have solr
> cloud configuration setup. I have set the ENABLE_REMOTE_JMX_OPTS="true" in
> solr.in.sh file. My Solr is set up on a Linux box and I am trying to
> access it from a windows machine. I believe that should not restrict the
> jconsole to access the remote JMX.
>
> I did specify the remote port in solr.in.sh file. Is there something else
> that needs to be configured, or do I have to specify a custom MBean?
>
> Thanks
> Prashant
>
> ***
> CONFIDENTIALITY NOTICE: This e-mail and any attachments are for the
> exclusive and confidential use of the intended recipient and may constitute
> non-public information. If you received this e-mail in error, disclosing,
> copying, distributing or taking any action in reliance of this e-mail is
> strictly prohibited and may be unlawful. Instead, please notify us
> immediately by return e-mail and promptly delete this message and its
> attachments from your computer system. We do not waive any work product or
> other applicable legal privilege(s) by the transmission of this message.
> ***
>


Re: How to build solr

2017-09-22 Thread Atita Arora
http://www.gingercart.com/Home/search-and-crawl/build-and-run-solr-from-source

and  follow thread

http://lucene.472066.n3.nabble.com/running-solr-in-debug-through-eclipse-td4159777.html

to run solr server in debug mode through eclipse.

Should give you some hint.

Let me go through your error again to see if I get some clue there.

-Atita
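
Two more hedged pointers, based only on the error text above. The
ivy-availability-check failure usually just means Ivy has not been
bootstrapped into ~/.ant/lib yet; running this once from the checkout
should fix it:

ant ivy-bootstrap

And for remote-debugging a custom handler jar inside a running Solr, one
common approach is starting Solr with JDWP options (the port number here
is an arbitrary choice) and attaching Eclipse via Run > Debug
Configurations > Remote Java Application:

bin/solr start -f -a "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=18983"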

On Fri, Sep 22, 2017 at 11:41 AM, srini sampath  wrote:

> Thanks Aman,
> Erick, I followed the link and I am getting the following error,
>
> Buildfile: ${user.home}\git\lucene-solr\build.xml
>
> compile:
>
> -check-git-state:
>
> -git-cleanroot:
>
> -copy-git-state:
>
> git-autoclean:
>
> resolve:
>
> ivy-availability-check:
>
> BUILD FAILED
> ${user.home}\git\lucene-solr\build.xml:309: The following error occurred
> while executing this line:
> ${user.home}\git\lucene-solr\lucene\build.xml:124: The following error
> occurred while executing this line:
> ${user.home}\git\lucene-solr\lucene\common-build.xml:424:
> ${user.home}\.ant\lib does not exist.
>
> Total time: 0 seconds
>
> Any Idea?
> How can I run the Solr server in debug mode?
>
> Here is the thing I am trying to do,
> Change a custom plugin called solrTextTagger
> and add some extra query
> parameters to it.
>
> I defined my custom handler in the following way
>
>- class="org.opensextant.solrtexttagger.TaggerRequestHandler">
>
>
>
>- And I defined my custom handler jar file location in
>solrschema.xml in the following way
>
>   
> (solr-text-tagger.jar
> location)
>
>- I made some changes to the solrTextTagger,
> And built a jar using
>maven.
>- I am running solr as a service. And sending a request using HTTP Post
>method.
>- But the problem is how can I debug solr-text-tagger.jar code to check
>and make changes. (I mean how to do remote debugging?)
>
>
> I am using eclipse IDE for development.
> I found similar problem here
>  in-Solr-Request-Handler-plugin-and-its-debugging-td4077533.html>.
> But I could not understand the solution.
>
> Best,
> Srini Sampth.
>
>
>
>
>
> On Thu, Sep 21, 2017 at 8:51 PM, Erick Erickson 
> wrote:
>
> > And did you follow the link provided on that page?
> >
> > Best,
> > Erick
> >
> > On Thu, Sep 21, 2017 at 3:07 AM, Aman Tandon 
> > wrote:
> > > Hi Srini,
> > >
> > > Kindly refer to the READ.ME section of this link of GitHub, this
> should
> > > work.
> > > https://github.com/apache/lucene-solr/blob/master/README.md
> > >
> > > With regards,
> > > Aman Tandon
> > >
> > >
> > > On Sep 21, 2017 1:53 PM, "srini sampath" 
> > > wrote:
> > >
> > >> Hi,
> > >> How to build and compile solr on my local machine? It seems the
> > >> https://wiki.apache.org/solr/HowToCompileSolr page became obsolete.
> > >> Thanks in advance
> > >>
> >
>


Re: AEM SOLR integaration

2017-09-22 Thread Atita Arora
https://www.slideshare.net/DEEPAKKHETAWAT/basics-of-solr-and-solr-integration-with-aem6-61150010

This could probably help too along with the link Nicole shared.

On Fri, Sep 22, 2017 at 12:28 PM, Nicole Bilić 
wrote:

> Hi,
>
> Maybe this could help you out http://www.aemsolrsearch.com/
>
> Regards,
> Nicole
>
> On Sep 22, 2017 05:41, "Gunalan V"  wrote:
>
> > Hello,
> >
> > I'm looking for suggestion in building the SOLR infrastructure so Kindly
> > let me know if anyone has integerated AEM (Adobe Experience Manager) with
> > SOLR?
> >
> >
> >
> > Thanks,
> > GVK
> >
>


Re: query with @ and *

2017-09-14 Thread Atita Arora
Hi,

Can you give us a little information about the query parser you are using
in your handler?

Thanks,
Ati
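
One guess while we wait for that info: wildcard queries like test@one* are
not analyzed, so if the field type tokenizes test@one.com into separate
tokens at the @ (as StandardTokenizer does), there is no single indexed
token the wildcard can match. A field type that keeps emails intact would
avoid that; a sketch (the field type name is made up):

<fieldType name="text_email" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

This emits test@one.com as one token, so test@one* can match it.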


On Thu, Sep 14, 2017 at 4:36 PM, Mannott, Birgit 
wrote:

> Hi,
>
> I have a problem when searching on email addresses.
> @ seems to be handled as a special character but I don't find anything
> about it in the documentation.
>
> This is my test data
> t...@one.com
> t...@two.com
>
> searching for test* results both, ok.
> searching for t...@one.com results the correct one, ok.
> searching for test results both, what I didn't expect but it's ok.
> searching for test@one* results none and that's the problem.
>
> Escaping the char @ doesn't change it.
> It seems that every query containing @ and * has no result.
>
> Has anyone an idea how to change this?
>
> Thanks,
> Birgit
>
>
>
>
>
>


Re: edismax, pf2 and use of both AND and OR parameter

2017-08-25 Thread Atita Arora
Hi,

I am in the middle of a similar use case: we have three
different fields on the UI (searchany, searchall and searchexcept),
respectively for OR, AND and NOT queries, and I need to know how to make
them work along with edismax; see the sketch below.
We can expect any/all of the fields to have free text.
Any suggestions/guidance is truly appreciated.

Thanks,
Atita
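
A minimal sketch of one way to combine the three boxes into a single
edismax q (SolrJ 6.x-style client construction; the field names, qf/pf2
values and URL are assumptions, and it presumes at least one of the two
positive boxes is filled):

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class TriBoxQuery {

  // Joins the words of one box with the given operator, e.g. "(a OR b)".
  static String group(String terms, String op) {
    return "(" + String.join(" " + op + " ", terms.trim().split("\\s+")) + ")";
  }

  public static void main(String[] args) throws Exception {
    String any = "wire cable";    // searchany    -> OR
    String all = "hdmi video";    // searchall    -> AND
    String except = "adapter";    // searchexcept -> NOT

    List<String> positive = new ArrayList<>();
    if (!all.trim().isEmpty()) positive.add(group(all, "AND"));
    if (!any.trim().isEmpty()) positive.add(group(any, "OR"));
    String q = String.join(" AND ", positive);
    if (!except.trim().isEmpty()) q += " NOT " + group(except, "OR");
    // q is now: (hdmi AND video) AND (wire OR cable) NOT (adapter)

    SolrQuery sq = new SolrQuery(q);
    sq.set("defType", "edismax");
    sq.set("qf", "title");
    sq.set("pf2", "title^100");

    try (SolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/products").build()) {
      System.out.println(client.query(sq).getResults().getNumFound() + " hits");
    }
  }
}

edismax honors the explicit AND/OR/NOT operators in q, so each box keeps
its intended semantics while qf and pf2 still apply to the terms.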

On Wed, Aug 2, 2017 at 2:21 PM, Aman Tandon  wrote:

> Hi,
>
> Ideally it should but from the debug query it seems like it is not
> respecting Boolean clauses.
>
> Anyone else could help here? Is this the ideal behavior?
>
> On Jul 31, 2017 5:47 PM, "Niraj Aswani"  wrote:
>
> > Hi Aman,
> >
> > Thank you very much for your reply.
> >
> > Let me elaborate my question a bit more using your example in this case.
> >
> > AFAIK, what the pf2 parameter is doing to the query is adding the
> following
> > phrase queries:
> >
> > (_text_:"system memory") (_text_:"memory oem") (_text_:"oem retail")
> >
> > There are three phrases being checked here:
> > - system memory
> > - memory oem
> > - oem retail
> >
> > However, what I actually expected it to look like is the following:
> > - system memory
> > - memory oem
> > - memory retail
> >
> > My understanding of the edismax parser is that it interprets the AND / OR
> > parameters correctly so it should generate the bi-gram phrases respecting
> > the AND /OR parameters as well, right?
> >
> > Am I missing something here?
> >
> > Regards,
> > Niraj
> >
> > On Mon, Jul 31, 2017 at 4:24 AM, Aman Tandon 
> > wrote:
> >
> > > Hi Niraj,
> > >
> > > Should I expect it to check the following bigram phrases?
> > >
> > > Yes it will check.
> > >
> > > ex- documents & query is given below
> > >
> > > http://localhost:8983/solr/myfile/select?wt=xml=name;
> > > indent=on=*System
> > > AND Memory AND (OEM OR Retail)*=50=json&*qf=_text_=_text_*
> > > =true=edismax
> > >
> > > 
> > > 
> > > 
> > > 
> > > A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200)
> System
> > > Memory - OEM
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200)
> > > System Memory - Retail
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > CORSAIR XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 400 (PC
> 3200)
> > > Dual Channel Kit System Memory - Retail
> > > 
> > > 
> > > 
> > > 
> > >
> > >
> > > *Below is the parsed query*
> > >
> > > 
> > > +(+(_text_:system) +(_text_:memory) +((_text_:oem) (_text_:retail)))
> > > ((_text_:"system memory") (_text_:"memory oem") (_text_:"oem retail"))
> > > 
> > >
> > > In case you are in such a scenario where you need to know what query
> > will
> > > form, then you could use debug=true to learn more about the query &
> > > timings of the different components.
> > >
> > > *And when ps2 is not specified, the default ps will be applied to pf2.*
> > >
> > > I hope this helps.
> > >
> > > With Regards
> > > Aman Tandon
> > >
> > > On Mon, Jul 31, 2017 at 4:18 AM, Niraj Aswani 
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I am using solr 4.4 and bit confused about how does the edismax
> parser
> > > > treat the pf2 parameter when both the AND and OR operators are used
> in
> > > the
> > > > query with ps2=0
> > > >
> > > > For example:
> > > >
> > > > pf2=title^100
> > > > q=HDMI AND Video AND (Wire OR Cable)
> > > >
> > > > Should I expect it to check the following bigram phrases?
> > > >
> > > > hdmi video
> > > > video wire
> > > > video cable
> > > >
> > > > Regards
> > > > Niraj
> > > >
> > >
> >
>


Re: omitNorms for short searchable fields and ID field

2017-08-25 Thread Atita Arora
Hi Chaula,

Norms are basically used for index-time boosts & field-length
normalization. What I mean is that when you set omitNorms=true for a field,
Solr stops storing the additional per-field stats regarding length, boosts,
etc., and hence drastically reduces the size of the index.
It is in fact advisable to set this to true for small fields like the ones
you mentioned you have in your schema.
It should not adversely impact performance as long as you don't have
field-length or index-boost related operations in your use case.

-Atita
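
For reference, the switch being discussed is set per field in schema.xml;
a hedged one-liner (the field name is invented for illustration):

<field name="sku" type="string" indexed="true" stored="true" omitNorms="true"/>

Note that primitive types such as string already omit norms by default, so
the flag mostly matters on analyzed text fields.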

On Fri, Aug 25, 2017 at 12:19 PM, Chaula Ganatra  wrote:

> Hello
>
> We have a use case with very large index split on 2 shards and 2 replicas.
> Each shard has around 200GB data.
> We want to reduce our index size, for which we tried omitting norms
> for all the fields. We have also done it for the ID field and short
> searchable fields.
> We have observed that it has reduced the size to almost 50% of the
> original, so we would like to go with this setting.
> Will it have any impact on performance or functionality? Please note we do
> not have any large text field for searching. But we have many different
> fields (dynamic fields) in our index.
>
> Regards
> Chaula
>
>
>


Re: High CPU utilization on Upgrading to Solr Version 6.3

2017-08-03 Thread Atita Arora
Hi All ,

Just thought of giving a quick update on this.
So we were able to *knock down this issue by using jvisualvm*, which comes
with Java.
We enabled monitoring through JMX, and the CPU profiling showed (as
attached in one of my previous emails) *highlighting taking the maximum
processing time.*
Mysteriously, this was happening in the highlighting merge step, which was
invoked when we enabled *mergecontiguous=true*. I'm still surprised that
turning just this one property to false resolved the issue, and we happily
went live last week.

Later, as I traced the code for this particular property, I found it
causing endless recursions.

Please guide / share if you have any other thoughts.

Thanks,
Atita
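
For anyone hitting the same thing: the property referred to above is, to
the best of my reading, the highlighter's hl.mergeContiguous parameter. A
hedged example of pinning it off in a handler's defaults (the handler name
is illustrative):

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="hl.mergeContiguous">false</str>
  </lst>
</requestHandler>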



On Fri, Jul 28, 2017 at 7:18 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 7/27/2017 1:30 AM, Atita Arora wrote:
> > What OS is Solr running on?  I'm only asking because some additional
> > information I'm after has different gathering methods depending on OS.
> > Other questions:
> >
> > /*OpenJDK 64-Bit Server VM (25.141-b16) for linux-amd64 JRE
> > (1.8.0_141-b16), built on Jul 20 2017 21:47:59 by "mockbuild" with gcc
> > 4.4.7 20120313 (Red Hat 4.4.7-18)*/
> > /*Memory: 4k page, physical 264477520k(92198808k free), swap 0k(0k
> free)*/
>
> Linux is the easiest to get good information from.  Run the "top"
> program in a commandline session.  Press shift-M to sort by memory size,
> and grab a screenshot.  Share that screenshot with a file sharing site
> and give us the URL.
>
> > Is there only one Solr process per machine, or more than one?
> > /*On an average yes , one solr process per machine , however , we do
> > have a machine (where this log is taken) has two solr processes
> > (master and slave)*/
>
> Running a master and a slave on one machine does nothing for
> redundancy.  They need to be on separate machines for that to really
> help.  As for multiple processes per machine, tou can have many indexes
> in one Solr instance -- you don't need more than one in most cases.
>
> > How many total documents are managed by one machine?
> > */About 220945 per machine ( and double for this machine as it has
> > instance of master as well as other slave)/*
> >
> > How big is all the index data managed by one machine?
> > */The index is about 4G./*
>
> If less than a quarter of a million documents results in a 4GB index,
> those documents must be ENORMOUS, or else there is something strange
> going on.
>
> > What is the max heap on each Solr process?
> > */Max heap is 25G for each Solr Process. (Xms 25g Xmx 25g)/*
> > */
> > /*
> > The reason of choosing RAMDirectory was that it was used in the
> > similar manner while the production Solr was on Version 4.3.2, so no
> > particular reason but just replicated how it was working , never
> > thought this may give troubles.
>
> Set up the slaves just like the masters, with
> NRTCachingDirectoryFactory.  For a couple hundred thousand docs, you
> probably only need a 2GB heap, possibly even less.
>
> > I had included a pastebin of GC snapshot (the complete log was too big
> > to be included in the pastebin , so pasted a sampler)
>
> I asked for the full log because that's what I need to look deeper.  A
> sampler won't be enough.  There are file sharing websites for sharing
> larger content, and if you compress the file before uploading it, you
> should be able to achieve a fairly impressive compression ratio.
> Dropbox is generally a good choice for sharing fairly large content.
> Dropbox also works for image data, like the "top" screenshot I asked for
> above.
>
> > Another thing is as we observed the CPU cycles yesterday in high load
> > condition we observed that the Highlighter component was taking
> > longest , is there anything in particular we forgot to include that
> > highlighting doesn't gives a performance hit .
> > Attached is the snapshot taken from jvisualvm.
>
> Attachments rarely make it through the mailing list.  Yours didn't, so I
> cannot see that snapshot.
>
> I do not know anything about highlighting, so I cannot comment on how
> much CPU it takes.  I've never used the feature.
>
> My best idea about why your CPU is so high is problems with garbage
> collection.  To look into that, I need to have the full GC log.  The
> rest of the information I've asked for will help focus my efforts.
>
> Thanks,
> Shawn
>
>


Re: High CPU utilization on Upgrading to Solr Version 6.3

2017-07-27 Thread Atita Arora
Hi Shawn ,

Thank you for the pointers , here is the information :


What OS is Solr running on?  I'm only asking because some additional
information I'm after has different gathering methods depending on OS.
Other questions:

*OpenJDK 64-Bit Server VM (25.141-b16) for linux-amd64 JRE (1.8.0_141-b16),
built on Jul 20 2017 21:47:59 by "mockbuild" with gcc 4.4.7 20120313 (Red
Hat 4.4.7-18)*
*Memory: 4k page, physical 264477520k(92198808k free), swap 0k(0k free)*


Is there only one Solr process per machine, or more than one?
*On average yes, one Solr process per machine; however, the machine where
this log was taken has two Solr processes (master and slave)*

How many total documents are managed by one machine?
*About 220945 per machine (and double for this machine, as it has an
instance of the master as well as another slave)*

How big is all the index data managed by one machine?
*The index is about 4G.*

What is the max heap on each Solr process?
*Max heap is 25G for each Solr Process. (Xms 25g Xmx 25g)*

The reason for choosing RAMDirectory was that it was used in a similar
manner while the production Solr was on version 4.3.2; no particular
reason, we just replicated how it was working and never thought this might
give trouble.

I had included a pastebin of a GC snapshot (the complete log was too big to
be included in the pastebin, so I pasted a sampler).

Another thing: as we observed the CPU cycles yesterday under high load, we
saw that the Highlighter component was taking the longest. Is there
anything in particular we forgot to configure so that highlighting doesn't
give a performance hit?
Attached is the snapshot taken from jvisualvm.

Please guide.

Thanks,
Atita

On Thu, Jul 27, 2017 at 2:46 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 7/26/2017 1:49 AM, Atita Arora wrote:
> > We did our functional and load testing on these boxes , however when we
> > released it to production along with the same application (using SolrJ to
> > query Solr) , we ran into severe CPU issues.
> > Just to add we're on Master - Slave where master has index on
> > NRTCachingDirectory
> > and Slave on RAMDirectory.
> >
> > As soon as we placed the slaves under load balancer , under NO LOAD
> > condition as well , the slave went from a load of 4 -> 10 -> 16 - > 100
> in
> > 12 mins.
> >
> > I suspected this to be caused due to replication but this is never
> ending ,
> > so before this crashed we de-provisioned it and brought it down.
> >
> > I'm not sure what could possibly cause it.
> >
> > I looked into the caches , where documentcache , filtercache ,
> > queryresultcaches are set to defaults 1024 and 100 documents.
> >
> > I tried observing the GC activity on GCViewer too , which does'nt really
> > shows something alarming (as in what I feel) - a sampler at
> > https://pastebin.com/cnuATYrS
>
> What OS is Solr running on?  I'm only asking because some additional
> information I'm after has different gathering methods depending on OS.
> Other questions:
>
> Is there only one Solr process per machine, or more than one?
> How many total documents are managed by one machine?
> How big is all the index data managed by one machine?
> What is the max heap on each Solr process?
>
> FYI, RAMDirectory is not the preferred way of running Solr or Lucene.
> If you have enough memory to hold the entire index, it's better to let
> the OS handle keeping that information in memory, rather than having
> Lucene and Java do it.
>
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
> NRTCachingDirectoryFactory uses MMap by default as its delegate
> implementation, so your master is fine.
>
> I would be interested in getting a copy of Solr's gc log from a system
> with high CPU to look at.
>
> Thanks,
> Shawn
>
>


High CPU utilization on Upgrading to Solr Version 6.3

2017-07-26 Thread Atita Arora
Hi ,

We upgraded our production Solr to version 6.3 from version 4.3.2 about a
week ago.
We had our Dev / QA / staging on the same version (6.3) before finally
releasing the application leveraging Solr 6.3.

We did our functional and load testing on these boxes; however, when we
released it to production along with the same application (using SolrJ to
query Solr), we ran into severe CPU issues.
Just to add, we're on master-slave, where the master has its index on
NRTCachingDirectory
and the slave on RAMDirectory.

As soon as we placed the slaves under the load balancer, even under a NO
LOAD condition, the slave went from a load of 4 -> 10 -> 16 -> 100 in
12 mins.

I suspected this to be caused by replication, but it was never ending, so
before it crashed we de-provisioned it and brought it down.

I'm not sure what could possibly cause it.

I looked into the caches, where documentCache, filterCache and
queryResultCache are set to the defaults of 1024 and 100 documents.

I tried observing the GC activity in GCViewer too, which doesn't really
show anything alarming (as far as I can tell); a sampler is at
https://pastebin.com/cnuATYrS

Can anyone possibly tell me the reasons ?

Thanks a lot in advance.

Atita


Re: Need guidance for distributing data base on date interval in a collection

2017-07-18 Thread Atita Arora
Hi Rehman,
I am not sure about your use case, but why wouldn't you consider creating a
shard for a particular date range, like within a week from the current
date, 15 days, a month, and so on (a sketch follows below)?

I have done a similar implementation elsewhere.
Can you tell us more about your use case?

Atita
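
A hedged sketch of that approach with the Collections API (all names are
illustrative): create the collection with the implicit router and a
routing field, then add one shard per day as the window advances. Each
document lands on the shard whose name matches its routing-field value.

http://host:8983/solr/admin/collections?action=CREATE&name=events&router.name=implicit&router.field=day_s&shards=day_20170718,day_20170719

http://host:8983/solr/admin/collections?action=CREATESHARD&collection=events&shard=day_20170720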

On Jul 18, 2017 1:04 PM, "rehman kahloon" 
wrote:

Hello Sir/Madam, I am new to SolrCloud, having Oracle
technologies experience.
Nowadays I am comparing Oracle and SolrCloud using big data.
So I want to know how I can create time-interval sharding.
E.g., I have 10 machines, each machine for one shard and one date's data.
So how can I make the next day's data go to the next shard, and so on?

I searched a lot but have not found any command/way that handles it from
some core/shard file.
So I request you to please guide me.
Thanks in advance.
Kind Regards, Muhammad Rehman kahloon, mrehman_kahl...@yahoo.com


Re: Cant stop/start server

2017-07-14 Thread Atita Arora
Did you mention the port with -p?
Like:

bin/solr stop -p 8983

Please check

On Jul 14, 2017 10:35 PM, "Iridian Group"  wrote:

> I know I am missing something very simple here but I can't stop/start my
> Solr instance with
> /opt/solr/bin/solr stop
>
> I get “No Solr nodes found to stop”, however the server is running. I can
> access the server via the default port and my app is able to use its
> services without issue.
>
>
> Thanks for any assistance!
>
>
>
>
>
> Thanks
>
> Keith Savoie
> Vice President of Technology
>
> IRiDiAN GROUP
>
> Helping organizations brand
> & market themselves through
> web, print, & social media.
>
>
> 14450 Eagle Run Dr. Ste. 120
> Omaha, Nebraska 68116
>
> P  • 402.422.0150
> W • iridiangroup.com 
>
> Join us on facebook  or twitter <
> https://twitter.com/iridiangroup>
>


Re: SolrSpellChecker not showing suggestions when the first character of a word is wrong

2017-05-11 Thread Atita Arora
Hi Arun,

Try adding

  <int name="minPrefix">0</int>

to your DirectSolrSpellChecker <lst name="spellchecker"> configuration.
minPrefix defaults to 1, which requires the first character of the
misspelled word to match an indexed term, so "naintain" gets no
suggestions.

It should work!

Thanks,
Atita

On Thu, May 11, 2017 at 6:34 AM, aruninfo100 
wrote:

> Hi All,
>
> I am trying to do spell check with Solr. I am able to get suggestions when
> the word is incorrectly spelled.
> E.g.: word entered (incorrectly): *maintaan*
> I am getting *"maintain"* as a suggestion, but if I provide *naintain*, it
> doesn't provide suggestions.
>
> *solrConfig:*
>
>  
> text_general
> 
> default
> spell_text
> solr.DirectSolrSpellChecker
> internal
> 0.5
> 
> 
> wordbreak
> solr.WordBreakSolrSpellChecker
> spell_text
> true
> true
> 5
> 5
> 
>
>
>  class="org.apache.solr.handler.component.SearchHandler">
>   
> default
> wordbreak
> true
> true
> 5
> 2
> 5
> true
> true
> 5
> 3
>  true
>  true
>   
>   
> spellcheck
>   
> 
>
> Kindly help me on this.
>
> Thanks and Regards,
> Arun
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/SolrSpellChecker-not-showing-suggestions-when-
> the-first-character-of-a-word-is-wrong-tp4334554.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Is it expected for Synonyms to work vice-versa

2017-05-01 Thread Atita Arora
Hi,

I have this strange issue happening today where I specified a certain
keyword to match as a synonym, as:

(^|[^a-zA-Z0-9])[cC][#]([^a-zA-Z0-9]|$)=>$1csharp$2

which essentially means anyone searching for "C#" should be matched with a
document containing "csharp" too.

Now I have run into something mysterious (at least it's a mystery for me!!
not sure if that's the expected behaviour): someone searching for
"sharp" is matched with docs containing "#", whereas no other
configuration specifies this elsewhere.

Is this normal / expected?

Please guide !

TIA -
Atita


Need help with date boost

2017-03-13 Thread Atita Arora
Hi all,

I am trying to resolve a problem here where I have to fiddle around with a
set of dates (created and updated date).
My use case is that I have to make sure that the document with the latest
(most recent) update date comes higher in my search results.
Precisely, I am required to maintain 3 buckets, wherein documents with an
updated date falling in the range of the last 30 days should have maximum
weight, followed by update dates within 60 and 90 days, and then the rest.
However, in cases where the update date is unavailable, I need to sort
using the created date.
I am not sure how to achieve this.
Any insights here would be a great help.
Thanks in advance.
Regards,
Atita
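
A hedged sketch with edismax, assuming fields named updated_date and
created_date (adjust to your schema): the three buckets can be additive
boost queries, and def() supplies the created-date fallback for the
freshness boost when the update date is missing.

bq=updated_date:[NOW/DAY-30DAYS TO NOW]^8
bq=updated_date:[NOW/DAY-60DAYS TO NOW/DAY-30DAYS]^4
bq=updated_date:[NOW/DAY-90DAYS TO NOW/DAY-60DAYS]^2
boost=recip(ms(NOW/DAY,def(updated_date,created_date)),3.16e-11,1,1)

The ^ weights and the recip constants are arbitrary starting points to
tune, not recommendations.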


Re: error during running my code java.lang.VerifyError: Bad type on operand stack

2017-01-06 Thread Atita Arora
Hi,

I found the same thing listed here at :

http://stackoverflow.com/questions/32105513/solr-bad-return-type-error

HttpSolrClient has a constructor that accepts an HttpClient. When not
passed, it creates an internalClient that is a CloseableHttpClient.

So you can create a default client and pass it as follows:

SystemDefaultHttpClient httpClient = new SystemDefaultHttpClient();
HttpSolrClient client = new HttpSolrClient(url, httpClient);


I think the problem is the incorrect usage.
Can you try this?

Thanks,
Atita
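
For completeness, a self-contained version of that snippet (SolrJ 5.x-era
API to match your version; the core name is made up):

import org.apache.http.impl.client.SystemDefaultHttpClient;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class Connect {
  public static void main(String[] args) throws Exception {
    // Passing our own HttpClient sidesteps the internal client creation
    // that trips the VerifyError when httpclient/httpcore jars mismatch.
    SystemDefaultHttpClient httpClient = new SystemDefaultHttpClient();
    SolrClient client = new HttpSolrClient("http://localhost:8983/solr/mycore", httpClient);
    client.close();
  }
}

If that still fails, checking for duplicate or mismatched httpclient and
httpcore jars on the classpath would be my next guess.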

On Jan 6, 2017 10:19 PM, "Susheel Kumar"  wrote:

Which solrj version are you using and can you point which line exactly
throws the error?

Thnx

On Fri, Jan 6, 2017 at 2:04 AM, gayathri...@tcs.com 
wrote:

> Hi
>
> I'm using Solr 5.4.0; while running my code I get the below error. Please
> suggest what has to be done.
>
> public static void main(String[] args) throws SolrServerException,
> IOException {
>
>
> String urlString = "http://localhost:8983/solr/";
> SolrClient client = new HttpSolrClient(urlString);
> }
>
> Error :
>
> java.lang.VerifyError: Bad type on operand stack
> Exception Details:
>   Location:
>
> org/apache/http/impl/client/DefaultHttpClient.setDefaultHttpParams(Lorg/
> apache/http/params/HttpParams;)V
> @4: invokestatic
>   Reason:
> Type 'org/apache/http/HttpVersion' (current frame, stack[1]) is not
> assignable to 'org/apache/http/ProtocolVersion'
>
> please suggest what has to be done
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/error-during-running-my-code-java-lang-VerifyError-Bad-type-on-
> operand-stack-tp4312690.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>