Re: copy field source not working in solr schema.xml

2016-04-27 Thread Andrea Gazzarini
Although what you pasted isn't the complete schema, I guess you're missing an explicit field (or a matching dynamicField) declaration for "i_member_id".
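
For reference, a minimal sketch of what the missing declarations could look
like; the dest name comes from the error message below, while the source
field name and the type are assumptions:

    <field name="member_id" type="string" indexed="true" stored="true" />
    <field name="i_member_id" type="string" indexed="true" stored="true" />
    <copyField source="member_id" dest="i_member_id" />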

On Wed, Apr 27, 2016, kavurupavan wrote:

> Error :
>  org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> Could not load conf for core demo7: copyField dest :'i_member_id' is not an
> explicit field and doesn't match a dynamicField.. Schema file is
> /opt/solr/example/solr/demo7/conf/schema.xml
>
> My schema.xml :
>
> <field ... required="true" multiValued="false" />
> <field ... stored="true" required="false" />
>
> <copyField ... />
> <copyField ... />
>
> Please help me.
> Thanks in advance.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/copy-field-source-not-working-in-solr-schema-xml-tp4273355.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


copy field source not working in solr schema.xml

2016-04-27 Thread kavurupavan
Error :
 org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Could not load conf for core demo7: copyField dest :'i_member_id' is not an
explicit field and doesn't match a dynamicField.. Schema file is
/opt/solr/example/solr/demo7/conf/schema.xml

My schema.xml :

[field and copyField declarations stripped by the mailing-list archive]

Please help me.
Thanks in advance.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/copy-field-source-not-working-in-solr-schema-xml-tp4273355.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Decide on facets from results

2016-04-27 Thread Alexandre Rafalovitch
What about a custom component? Something similar to spell-checker? Add
it last after everything else.

It would have to be custom because you have some domain magic about
how to decide what fields to facet on.

Regards,
  Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 28 April 2016 at 11:45, Erick Erickson  wrote:
> Mark:
>
> You can do anything you want that Java can do ;). Smart-alec comments
> aside, there's
> no mechanism for doing this in Solr that I know of. The first thing
> I'd do is try the two-query-
> from-the-client approach to see if it was "fast enough".
>
> Best,
> Erick (the other one)
>
> On Wed, Apr 27, 2016 at 1:21 PM, Mark Robinson  
> wrote:
>> Thanks Eric!
>> So that will mean another call will definitely be required to SOLR with the
>> facets, before the results can be sent back (with the facet fields being
>> derived by traversing through the response).
>>
>> I was basically checking on whether in the "process" method (I believe
>> results will be accessed in the process method), we can dynamically
>> generate facets after traversing through the results and identifying the
>> fields for faceting, using some aggregation function or so, without having
>> to make another call using facet=on&facet.field=<fields>, before the
>> response is sent back to the user.
>>
>> Cheers!
>>
>> On Wed, Apr 27, 2016 at 2:27 PM, Erik Hatcher 
>> wrote:
>>
>>> Results will vary based on how you indexed those fields, but sure…
>>> facet=on&facet.field=<fields> - with sufficient RAM, lots of fun to be
>>> had!
>>>
>>> —
>>> Erik Hatcher, Senior Solutions Architect
>>> http://www.lucidworks.com 
>>>
>>>
>>>
>>> > On Apr 27, 2016, at 12:13 PM, Mark Robinson 
>>> wrote:
>>> >
>>> > Hi,
>>> >
>>> > If I don't have my facet list at query time, from the results can I
>>> select
>>> > some fields and by any means create a facet on them? ie after I get the
>>> > results I want to identify some fields as facets and send back facets for
>>> > them in the response.
>>> >
>>> > A kind of very dynamic faceting based on the results!
>>> >
>>> > Could someone please share their idea.
>>> >
>>> > Thanks!
>>> > Anil.
>>>
>>>


Re: measuring query performance & qps per node

2016-04-27 Thread Erick Erickson
In SolrCloud you can collect stats on pivot facets, see:
https://issues.apache.org/jira/browse/SOLR-6351

There are more buckets to count into and in SolrCloud you
have extra work to reconcile the partial results from
different shards.
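
For reference, a hedged sketch of the syntax from that issue (the field
names here are made up):

    stats.field={!tag=piv1}price&facet=true&facet.pivot={!stats=piv1}category,inStock

This computes stats on the price field for every bucket of the
category,inStock pivot.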

Best,
Erick

On Mon, Apr 25, 2016 at 8:50 PM, Jay Potharaju  wrote:
> Thanks for the response Erick. I knew that it would depend on the number of
> factors like you mentioned.I just wanted to know whether a  good
> combination of queries, facets & filters should be a good estimate of how
> solr might behave.
>
> what did you mean by "Add stats to pivots in Cloud mode."
>
> Thanks
>
> On Mon, Apr 25, 2016 at 5:05 PM, Erick Erickson 
> wrote:
>
>>  Impossible to answer. For instance, a facet query can be very
>> heavy-duty. Add stats
>> to pivots in Cloud mode.
>>
>> As for using a bunch of fq clauses, It Depends (tm). If your expected usage
>> pattern is all queries like 'q=*:*&fq=clause1&fq=clause2' then it's
>> fine. It totally
>> falls down if, for instance, you have a bunch of facets. Or grouping.
>> Or.
>>
>> Best,
>> Erick
>>
>> On Mon, Apr 25, 2016 at 3:48 PM, Jay Potharaju 
>> wrote:
>> > Hi,
>> > I am trying to measure how well our queries are performing, i.e. how long
>> > they are taking. In order to measure query speed I am using solrmeter with 50k
>> > unique filter queries. And then checking if any of the queries are slower
>> > than 50ms. Is this a good approach to measure query performance?
>> >
>> > Are there any guidelines on how to measure if a given instance can
>> handle a
>> > given number of qps(query per sec)? For example if my doc size is 30
>> > million docs and index size is 40 GB of data and the RAM on the instance
>> is
>> > 60 GB, then how many qps can it handle? Or is this a hard question to
>> > answer and it depends on the load and type of query running at a given
>> time.
>> >
>> > --
>> > Thanks
>> > Jay
>>
>
>
>
> --
> Thanks
> Jay Potharaju


Re: Degraded performance between Solr 4 and Solr 5

2016-04-27 Thread Erick Erickson
This is rather strange, not sure what's going on here, I'll have
to leave it to others to speculate I'm afraid...

Although I do wonder what a profiling tool would show.

Best,
Erick

On Wed, Apr 27, 2016 at 8:51 AM, Jaroslaw Rozanski
 wrote:
> OK, so here is an interesting find.
>
> As my setup requires frequent (soft) commits cache brings little value.
> I tested following on Solr 5.5.0:
>
> q={!cache=false}*:*&
> fq={!cache=false}query1 /* not expensive */&
> fq={!cache=false cost=200}query2 /* expensive! */&
>
> Only with the above set-up (and forcing Solr Post Filtering for the expensive
> query, hence cost 200) was I able to return to Solr 4.10.3 performance.
>
> By Solr 4 performance I mean:
> - not only Solr 4 response times (roughly) for queries returning values,
> but also
> - very fast response for queries that have 0 results
>
> I wonder what could be the underlying cause.
>
> Thanks,
> Jarek
>
> On Wed, 27 Apr 2016, at 09:13, Jaroslaw Rozanski wrote:
>> Hi Eric,
>>
>> Measuring running queries via JMeter. Values provided are rounded median
>> of multiple samples. Medians are just slightly better than 99th
>> percentile for all samples.
>>
>> The filter caches are useless as you mentioned; they are effectively not used.
>> There is auto-warming through cache autoWarm but no auto-warming
>> queries.
>>
>> Small experiment with passing =... seems not to make any difference
>> which would not be surprising given caches are barely involved.
>>
>> Thanks for the suggestion on IO. After stopping indexing, the response
>> time barely changed on Solr 5. On Solr 4, with indexing running it is
>> still fast. So effectively, Solr 4 under indexing load is faster than
>> idle Solr 5. Both set-ups have same heap size and available RAM on
>> machine (2x heap).
>>
>> One other thing I am testing is issuing request to specific core, with
>> distrib=false. No significant improvements there.
>>
>> Now what is interesting is that the aforementioned query takes the same
>> amount of time to execute regardless of the number of documents found.
>> - Whether it is 0 or 10k, it takes a couple of seconds on Solr 5.5.0.
>> - Meanwhile, on Solr 4.10.3, the response time is dependent on the result
>> size. For Solr 4, no results returns in a few ms, and a couple of
>> thousand results take a few seconds.
>> (query used q={!cache=false}...)
>>
>>
>> Thanks,
>> Jarek
>>
>> On Wed, 27 Apr 2016, at 04:39, Erick Erickson wrote:
>> > Well, the first question is always "how are you measuring this"?
>> > Measuring a few queries is almost completely uninformative,
>> > especially if the two systems have differing warmups. The only
>> > meaningful measurements are when throwing away the first bunch
>> > of queries then measuring a meaningful sample.
>> >
>> > The setup you describe will be very sensitive to disk access
>> > with the autowarm of 1 second, so if there's much at all in
>> > the way of differences in I/O that would be a red flag.
>> >
>> > From here on down doesn't really respond to the question, but
>> > I thought I'd mention it.
>> >
>> > And you don't have to worry about disabling your filterCache since
>> > any filter query of the form fq=field:[mention NOW in here without
>> > rounding]
>> > will never be re-used. So you might as well use {!cache=false}. Here's the
>> > background:
>> >
>> > https://lucidworks.com/blog/2012/02/23/date-math-now-and-filter-queries/
>> >
>> > And your soft commit is probably throwing out all the filter caches
>> > anyway.
>> >
>> > I doubt you're doing any autowarming at all given the autocommit interval
>> > of 1 second and continuously updating documents and your reported
>> > query times. So you can pretty much forget what I said about throwing
>> > out your first N queries since you're (probably) not getting any benefit
>> > out of caches anyway.
>> >
>> > On Tue, Apr 26, 2016 at 10:34 AM, Jaroslaw Rozanski
>> >  wrote:
>> > > Hi all,
>> > >
>> > > I am migrating a large Solr Cloud cluster from Solr 4.10 to Solr 5.5.0
>> > > and I observed big difference in query execution time.
>> > >
>> > > First a setup summary:
>> > > - multiple collections - 6
>> > > - each has multiple shards - 6
>> > > - same/similar hardware
>> > > - indexing tens of messages per second
>> > > - autoSoftCommit with 1s; hard commit few tens of seconds
>> > > - Java 8
>> > >
>> > > The query has following form: field1:[* TO NOW-14DAYS] OR (-field1:[* TO
>> > > *] AND field2:[* TO NOW-14DAYS])
>> > >
>> > > The fields field1 & field2 are of date type:
>> > > > > > positionIncrementGap="0"/>
>> > >
>> > > As query (q={!cache=false}...)
>> > > Solr 4.10 -> 5s
>> > > Solr 5.5.0 -> 12s
>> > >
> > > As filter query (q={!cache=false}*:*&fq=...)
>> > > Solr 4.10 -> 9s
>> > > Solr 5.5.0 -> 11s
>> > >
>> > > The query itself is bad and its optimization aside, I am wondering if
>> > > there is anything in Lucene/Solr that would have such an impact on query
>> > > execution time between versions.
>> > >
>> > > 

Re: Decide on facets from results

2016-04-27 Thread Erick Erickson
Mark:

You can do anything you want that Java can do ;). Smart-alec comments
aside, there's
no mechanism for doing this in Solr that I know of. The first thing
I'd do is try the two-query-
from-the-client approach to see if it was "fast enough".

Best,
Erick (the other one)

On Wed, Apr 27, 2016 at 1:21 PM, Mark Robinson  wrote:
> Thanks Eric!
> So that will mean another call will definitely be required to SOLR with the
> facets, before the results can be sent back (with the facet fields being
> derived by traversing through the response).
>
> I was basically checking on whether in the "process" method (I believe
> results will be accessed in the process method), we can dynamically
> generate facets after traversing through the results and identifying the
> fields for faceting, using some aggregation function or so, without having
> to make another call using facet=on&facet.field=<fields>, before the
> response is sent back to the user.
>
> Cheers!
>
> On Wed, Apr 27, 2016 at 2:27 PM, Erik Hatcher 
> wrote:
>
>> Results will vary based on how you indexed those fields, but sure…
>> facet=on&facet.field=<fields> - with sufficient RAM, lots of fun to be
>> had!
>>
>> —
>> Erik Hatcher, Senior Solutions Architect
>> http://www.lucidworks.com 
>>
>>
>>
>> > On Apr 27, 2016, at 12:13 PM, Mark Robinson 
>> wrote:
>> >
>> > Hi,
>> >
>> > If I don't have my facet list at query time, from the results can I
>> select
>> > some fields and by any means create a facet on them? ie after I get the
>> > results I want to identify some fields as facets and send back facets for
>> > them in the response.
>> >
>> > A kind of very dynamic faceting based on the results!
>> >
>> > Could someone please share their idea.
>> >
>> > Thanks!
>> > Anil.
>>
>>


Solr UI to Display Hyperlinks

2016-04-27 Thread sheon banks
Hi All,

I have Nutch configured with Solr.  Previous versions of Nutch had a search
screen which returned hyperlinks.  How do I get the same functionality using
Solr?  Can someone point me to the documentation which discusses how to
modify the Solr UI to return links?

sheon


Re: ANN: Solr puzzle: Magic Date

2016-04-27 Thread Alexandre Rafalovitch
Thank you for the feedback Reth. You have interesting comments I'll
keep in mind.

This particular attempt was more specifically a puzzle - a combination
of Solr features that give a surprising result. They would probably be
a bit tricky for interviews.

I have thought about 'scenario implementing' problems as well, but
that would take a slightly different approach.

Regards,
   Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 27 April 2016 at 17:59, Reth RM  wrote:
> Yes, these can be practice/interview questions. But, considering the
> specific example above, it seems the question is about spotting a
> syntax error(?); it is not expected that a developer/Solr user knows the
> right syntax or commands by heart. What could be interesting is questions
> related to cloud concepts, ranking concepts (tf-idf, bm25) or simple problem
> statements that ask how something can be implemented using Solr's OOTB
> features/apis, and so on. If such are the upcoming puzzle questions, I'm
> sure they will be useful.
> I liked the idea.
>
>
> On Tue, Apr 26, 2016 at 5:49 PM, Alexandre Rafalovitch 
> wrote:
>
>> I am doing an experiment in teaching about Solr. I've created a Solr
>> puzzle and want to know whether people would find it useful to do more
>> of these. My mailing list have seen this already, but I would love the
>> feedback from a wider Solr audience as well. Privately or on the list.
>>
>> The - first - puzzle is deceptively simple:
>>
>> --
>> Given the following sequence of commands (for Solr 5.5 or 6.0):
>>
>> 1. bin/solr create_core -c puzzle_date
>> 2. bin/post -c puzzle_date -type text/csv -d $'today\n2016-04-08'
>> 3. curl http://localhost:8983/solr/puzzle_date/select?q=Fri
>>
>> --
>> Would the result be:
>>
>> 1.Error in the command 1 for not providing a configuration directory
>> 2.Error in the command 2 for missing a uniqueKey field
>> 3.Error in the command 2 due to an incorrect date format
>> 4.No records in the command 3 output
>> 5.One record in the command 3 output
>> --
>>
>> You can find the answer and full in-depth explanation at:
>> http://blog.outerthoughts.com/2016/04/solr-5-puzzle-magic-date-answer/
>>
>> Again, what I am trying to understand is whether that's somehow useful
>> to people and worth making time to create and write-up.
>>
>> Any feedback would be appreciated.
>>
>> Regards,
>> Alex.
>>
>> 
>> Newsletter and resources for Solr beginners and intermediates:
>> http://www.solr-start.com/
>>


Re: Cross collection join in Solr 5.x

2016-04-27 Thread Shikha Somani
We have identified a fix for this issue. Please refer to the
defect's comments section.

Thanks,
Shikha


From: Susmit Shukla 
Sent: 21 April 2016 19:06
To: solr-user@lucene.apache.org
Subject: Re: Cross collection join in Solr 5.x

I have done it by extending the solr join plugin. Needed to override 2
methods from join plugin and it works out.

Thanks,
Susmit

On Thu, Apr 21, 2016 at 12:01 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:

> Hello,
>
> There is not much progress on
> https://issues.apache.org/jira/browse/SOLR-8297
> Although it's really achievable.
>
> On Thu, Apr 21, 2016 at 7:52 PM, Shikha Somani 
> wrote:
>
> > Greetings,
> >
> >
> > Background: Our application is using Solr 4.10 and has multiple
> > collections all of them sharded equally on Solr. These collections were
> > joined to support complex queries.
> >
> >
> > Problem: We are trying to upgrade to Solr 5.x. However from Solr 5.2
> > onward to join two collections it is a requirement that the secondary
> > collection must be singly sharded and replicated where primary collection
> > is. But collections are very large and need to be sharded for
> performance.
> >
> >
> > Query: Is there any way in Solr 5.x to join two collections both of which
> > are equally sharded i.e. the secondary collection is also sharded as the
> > primary.
> >
> >
> > Thanks,
> > Shikha
> >
> > 
> >
> >
> >
> >
> >
> >
> > NOTE: This message may contain information that is confidential,
> > proprietary, privileged or otherwise protected by law. The message is
> > intended solely for the named addressee. If received in error, please
> > destroy and notify the sender. Any use of this email is prohibited when
> > received in error. Impetus does not represent, warrant and/or guarantee,
> > that the integrity of this communication has been maintained nor that the
> > communication is free of errors, virus, interception or interference.
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
>








NOTE: This message may contain information that is confidential, proprietary, 
privileged or otherwise protected by law. The message is intended solely for 
the named addressee. If received in error, please destroy and notify the 
sender. Any use of this email is prohibited when received in error. Impetus 
does not represent, warrant and/or guarantee, that the integrity of this 
communication has been maintained nor that the communication is free of errors, 
virus, interception or interference.


MoreLikeThis Handler Solr 4.9

2016-04-27 Thread Esan London
Hi all,

I set up a MoreLikeThis request handler in Solr 4.9 and it works fine for
fields that are text, but when I try to use a float field the results "seem"
to treat the float value as text, e.g. for an input document with a float
field value of 3, a document with the value 3000 would be returned in the
top ten MLT results.

Has anyone ever seen this?
thanks


Re: Decide on facets from results

2016-04-27 Thread Mark Robinson
Thanks Eric!
So that will mean another call will definitely be required to SOLR with the
facets, before the results can be sent back (with the facet fields being
derived by traversing through the response).

I was basically checking on whether in the "process" method (I believe
results will be accessed in the process method), we can dynamically
generate facets after traversing through the results and identifying the
fields for faceting, using some aggregation function or so, without having
to make another call using facet=on&facet.field=<fields>, before the
response is sent back to the user.

Cheers!

On Wed, Apr 27, 2016 at 2:27 PM, Erik Hatcher 
wrote:

> Results will vary based on how you indexed those fields, but sure…
> facet=on&facet.field=<fields> - with sufficient RAM, lots of fun to be
> had!
>
> —
> Erik Hatcher, Senior Solutions Architect
> http://www.lucidworks.com 
>
>
>
> > On Apr 27, 2016, at 12:13 PM, Mark Robinson 
> wrote:
> >
> > Hi,
> >
> > If I don't have my facet list at query time, from the results can I
> select
> > some fields and by any means create a facet on them? ie after I get the
> > results I want to identify some fields as facets and send back facets for
> > them in the response.
> >
> > A kind of very dynamic faceting based on the results!
> >
> > Could someone please share their idea.
> >
> > Thanks!
> > Anil.
>
>


Re: Replicas for same shard not in sync

2016-04-27 Thread Jeff Wartes
I didn’t leave it out, I was asking what it was. I’ve been reading around some 
more this morning though, and here’s what I’ve come up with, feel free to 
correct.

Continuing my scenario:

If you did NOT specify min_rf
5. leader sets leader_initiated_recovery in ZK for the replica with the 
failure. Hopefully that replica notices and re-syncs at some point, because it 
can’t become a leader until it does. (SOLR-5495, SOLR-8034)
6. leader returns success to the client (http://bit.ly/1UhB2cF)

If you specified a min_rf and it WAS achieved:

5. leader sets leader_initiated_recovery in ZK for the replica with the failure.

6. leader returns success (and the achieved rf) to the client (SOLR-5468, 
SOLR-8062)


If you specified a min_rf and it WASN'T achieved:
5. leader does NOT set leader_initiated_recovery (SOLR-8034)
6. leader returns success (and the achieved rf) to the client (SOLR-5468, 
SOLR-8062)

I couldn’t seem to find anyplace that’d cause an error return to the client, 
aside from race conditions around who the leader should be, or if the update 
couldn’t be applied to the leader itself.
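
For reference, a hedged sketch of an update that asks for a minimum
replication factor (the collection name and document are made up); the
response carries the achieved rf, which the client can check before deciding
to retry:

    curl 'http://localhost:8983/solr/mycollection/update?min_rf=2' \
      -H 'Content-Type: application/json' \
      -d '[{"id":"doc1"}]'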






On 4/26/16, 8:22 PM, "Erick Erickson"  wrote:

>You left out step 5... leader responds with fail for the update to the
>client. At this point, the client is in charge of retrying the docs.
>Retrying will update all the docs that were successfully indexed in
>the failed packet, but that's not unusual.
>
>There's no real rollback semantics that I know of. This is analogous
>to not hitting minRF, see:
>https://support.lucidworks.com/hc/en-us/articles/212834227-How-does-indexing-work-in-SolrCloud.
>In particular the bit about "it is the client's responsibility to
>re-send it"...
>
>There's some retry logic in the code that distributes the updates from
>the leader as well.
>
>Best,
>Erick
>
>On Tue, Apr 26, 2016 at 12:51 PM, Jeff Wartes  wrote:
>>
>> At the risk of thread hijacking, this is an area where I don’t know I fully 
>> understand, so I want to make sure.
>>
>> I understand the case where a node is marked “down” in the clusterstate, but 
>> what if it’s down for less than the ZK heartbeat? That’s not unreasonable, 
>> I’ve seen some recommendations for really high ZK timeouts. Let’s assume 
>> there’s some big GC pause, or some other ephemeral service interruption that 
>> recovers very quickly.
>>
>> So,
>> 1. leader gets an update request
>> 2. leader makes update requests to all live nodes
>> 3. leader gets success responses from all but one replica
>> 4. leader gets failure response from one replica
>>
>> At this point we have different replicas with different data sets. Does 
>> anything signal that the failure-response node has now diverged? Does the 
>> leader attempt to roll back the other replicas? I’ve seen references to 
>> leader-initiated-recovery, is this that?
>>
>> And regardless, is the update request considered a success (and reported as 
>> such to the client) by the leader?
>>
>>
>>
>> On 4/25/16, 12:14 PM, "Erick Erickson"  wrote:
>>
>>>Ted:
>>>Yes, deleting and re-adding the replica will be fine.
>>>
>>>Having commits happen from the client when you _also_ have
>>>autocommits that frequently (10 seconds and 1 second are pretty
>>>aggressive BTW) is usually not recommended or necessary.
>>>
>>>David:
>>>
>>>bq: if one or more replicas are down, updates presented to the leader
>>>still succeed, right?  If so, tedsolr is correct that the Solr client
>>>app needs to re-issue update
>>>
>>>Absolutely not the case. When the replicas are down, they're marked as
>>>down by Zookeeper. When then come back up they find the leader through
>>>Zookeeper magic and ask, essentially "Did I miss any updates"? If the
>>>replica did miss any updates it gets them from the leader either
>>>through the leader replaying the updates from its transaction log to
>>>the replica or by replicating the entire index from the leader. Which
>>>path is followed is a function of how far behind the replica is.
>>>
>>>In this latter case, any updates that come in to the leader while the
>>>replication is happening are buffered and replayed on top of the index
>>>when the full replication finishes.
>>>
>>>The net-net here is that you should not have to track whether updates
>>>got to all the replicas or not. One of the major advantages of
>>>SolrCloud is to remove that worry from the indexing client...
>>>
>>>Best,
>>>Erick
>>>
>>>On Mon, Apr 25, 2016 at 11:39 AM, David Smith
>>> wrote:
 Erick,

 So that my understanding is correct, let me ask, if one or more replicas 
 are down, updates presented to the leader still succeed, right?  If so, 
 tedsolr is correct that the Solr client app needs to re-issue updates, if 
 it wants stronger guarantees on replica consistency than what Solr 
 provides.

 The “Write Fault Tolerance” section of the Solr Wiki makes what I believe 
 is 

Re: Decide on facets from results

2016-04-27 Thread Erik Hatcher
Results will vary based on how you indexed those fields, but sure… 
facet=on&facet.field=<fields> - with sufficient RAM, lots of fun to be had!

—
Erik Hatcher, Senior Solutions Architect
http://www.lucidworks.com 



> On Apr 27, 2016, at 12:13 PM, Mark Robinson  wrote:
> 
> Hi,
> 
> If I don't have my facet list at query time, from the results can I select
> some fields and by any means create a facet on them? ie after I get the
> results I want to identify some fields as facets and send back facets for
> them in the response.
> 
> A kind of very dynamic faceting based on the results!
> 
> Could someone please share their idea.
> 
> Thanks!
> Anil.



Re: Questions on SolrCloud core state, when will Solr recover a "DOWN" core to "ACTIVE" core.

2016-04-27 Thread Li Ding
Hi Erick,

I don't have the GC logs.  But after the GC finished, shouldn't the ZK ping
succeed and the core go back to its normal state?  From the log I
posted, the sequence is:

1) Solr detects that it can't connect to ZK and reconnects to ZK
2) Solr marks all cores as down
3) Solr recovers each core; some succeed, some fail.
4) After 30 minutes, the cores that failed are still marked as down.

So my question is: during the 30-minute interval, if GC took too long,
all cores should have failed.  And GC doesn't take longer than a minute, since
all serving requests to other cores succeed, so the next ZK ping should
bring the core back to normal, right?  We have an active monitor running at
the same time querying every core in distrib=false mode, and every query
succeeds.

Thanks,

Li
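
For reference, a sketch of the Java 8 flags that would capture the GC pauses
Erick describes below; the log path is a placeholder:

    -Xloggc:/var/solr/logs/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
    -XX:+PrintGCApplicationStoppedTime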

On Tue, Apr 26, 2016 at 6:20 PM, Erick Erickson 
wrote:

> One of the reasons this happens is if you have very
> long GC cycles, longer than the Zookeeper "keep alive"
> timeout. During a full GC pause, Solr is unresponsive and
> if the ZK ping times out, ZK assumes the machine is
> gone and you get into this recovery state.
>
> So I'd collect GC logs and see if you have any
> stop-the-world GC pauses that take longer than the ZK
> timeout.
>
> see Mark Millers primer on GC here:
> https://lucidworks.com/blog/2011/03/27/garbage-collection-bootcamp-1-0/
>
> Best,
> Erick
>
> On Tue, Apr 26, 2016 at 2:13 PM, Li Ding  wrote:
> > Thank you all for your help!
> >
> > The zookeeper log rolled over, thisis from Solr.log:
> >
> > Looks like the solr and zk connection is gone for some reason
> >
> > INFO  - 2016-04-21 12:37:57.536;
> > org.apache.solr.common.cloud.ConnectionManager; Watcher
> > org.apache.solr.common.cloud.ConnectionManager@19789a96
> > name:ZooKeeperConnection Watcher:{ZK HOSTS here} got event WatchedEvent
> > state:Disconnected type:None path:null path:null type:None
> >
> > INFO  - 2016-04-21 12:37:57.536;
> > org.apache.solr.common.cloud.ConnectionManager; zkClient has disconnected
> >
> > INFO  - 2016-04-21 12:38:24.248;
> > org.apache.solr.common.cloud.DefaultConnectionStrategy; Connection
> expired
> > - starting a new one...
> >
> > INFO  - 2016-04-21 12:38:24.262;
> > org.apache.solr.common.cloud.ConnectionManager; Waiting for client to
> > connect to ZooKeeper
> >
> > INFO  - 2016-04-21 12:38:24.269;
> > org.apache.solr.common.cloud.ConnectionManager; Connected:true
> >
> >
> > Then it publishes all cores on the hosts are down.  I just list three
> cores
> > here:
> >
> > INFO  - 2016-04-21 12:38:24.269; org.apache.solr.cloud.ZkController;
> > publishing core=product1_shard1_replica1 state=down
> >
> > INFO  - 2016-04-21 12:38:24.271; org.apache.solr.cloud.ZkController;
> > publishing core=collection1 state=down
> >
> > INFO  - 2016-04-21 12:38:24.272; org.apache.solr.cloud.ZkController;
> > numShards not found on descriptor - reading it from system property
> >
> > INFO  - 2016-04-21 12:38:24.289; org.apache.solr.cloud.ZkController;
> > publishing core=product2_shard5_replica1 state=down
> >
> > INFO  - 2016-04-21 12:38:24.292; org.apache.solr.cloud.ZkController;
> > publishing core=product2_shard13_replica1 state=down
> >
> >
> > product1 has only one shard one replica and it's able to be active
> > successfully:
> >
> > INFO  - 2016-04-21 12:38:26.383; org.apache.solr.cloud.ZkController;
> > Register replica - core:product1_shard1_replica1 address:http://
> > {internalIp}:8983/solr collection:product1 shard:shard1
> >
> > WARN  - 2016-04-21 12:38:26.385; org.apache.solr.cloud.ElectionContext;
> > cancelElection did not find election node to remove
> >
> > INFO  - 2016-04-21 12:38:26.393;
> > org.apache.solr.cloud.ShardLeaderElectionContext; Running the leader
> > process for shard shard1
> >
> > INFO  - 2016-04-21 12:38:26.399;
> > org.apache.solr.cloud.ShardLeaderElectionContext; Enough replicas found
> to
> > continue.
> >
> > INFO  - 2016-04-21 12:38:26.399;
> > org.apache.solr.cloud.ShardLeaderElectionContext; I may be the new
> leader -
> > try and sync
> >
> > INFO  - 2016-04-21 12:38:26.399; org.apache.solr.cloud.SyncStrategy; Sync
> > replicas to http://{internalIp}:8983/solr/product1_shard1_replica1/
> >
> > INFO  - 2016-04-21 12:38:26.399; org.apache.solr.cloud.SyncStrategy; Sync
> > Success - now sync replicas to me
> >
> > INFO  - 2016-04-21 12:38:26.399; org.apache.solr.cloud.SyncStrategy;
> > http://{internalIp}:8983/solr/product1_shard1_replica1/
> > has no replicas
> >
> > INFO  - 2016-04-21 12:38:26.399;
> > org.apache.solr.cloud.ShardLeaderElectionContext; I am the new leader:
> > http://{internalIp}:8983/solr/product1_shard1_replica1/ shard1
> >
> > INFO  - 2016-04-21 12:38:26.399;
> org.apache.solr.common.cloud.SolrZkClient;
> > makePath: /collections/product1/leaders/shard1
> >
> > INFO  - 2016-04-21 12:38:26.412; org.apache.solr.cloud.ZkController; We
> are
> > http://{internalIp}:8983/solr/product1_shard1_replica1/ and leader is
> > 

Re: Tuning solr for large index with rapid writes

2016-04-27 Thread Stephen Lewis
> If I'm reading this right, you have 420M docs on a single shard?
Yep, you were reading it right. Thanks for your guidance. We will do
various prototyping following "the sizing exercise".

Best,
Stephen
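
As a side note on the collection-aliasing approach mentioned below, a hedged
sketch of the Collections API call (the names are made up):

    http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=docs&collections=docs_a,docs_b

Queries against the alias "docs" then cover both collections.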

On Tue, Apr 26, 2016 at 6:17 PM, Erick Erickson 
wrote:

> If I'm reading this right, you have 420M docs on a single shard? If that's
> true
> you are pushing the envelope of what I've seen work and be performant. Your
> OOM errors are the proverbial 'smoking gun' that you're putting too many
> docs
> on too few nodes.
>
> You say that the document count is "growing quite rapidly". My expectation
> is
> that your problems will only get worse as you cram more docs into your
> shard.
>
> You're correct that adding more memory (and consequently more JVM
> memory?) only gets you so far before you start running into GC trouble,
> when you hit full GC pauses they'll get longer and longer which is its own
> problem. And you don't want to have huge JVM memory at the expense
> of op system memory due the fact that Lucene uses MMapDirectory, see
> Uwe's excellent blog:
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
> I'd _strongly_ recommend you do "the sizing exercise". There are lots of
> details here:
>
> https://lucidworks.com/blog/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
>
> You've already done some of this inadvertently, unfortunately it sounds
> like
> it's in production. If I were going to guess, I'd say the maximum number of
> docs on any shard should be less than half what you currently have. So you
> need to figure out how many docs you expect to host in this collection
> eventually
> and have N/200M shards. At least.
>
> There are various strategies when the answer is "I don't know", you
> might add new
> collections when you max out and then use "collection aliasing" to
> query them etc.
>
> Best,
> Erick
>
> On Tue, Apr 26, 2016 at 3:49 PM, Stephen Lewis  wrote:
> > Hello,
> >
> > I'm looking for some guidance on the best steps for tuning a solr cloud
> > cluster which is heavy on writes. We are currently running a solr cloud
> > fleet composed of one core, one shard, and three nodes. The cloud is
> hosted
> > in AWS, and each solr node is on its own linux r3.2xl instance with 8 cpu
> > and 61 GiB mem, and a 2TB EBS volume attached. Our index is currently 550
> > GiB over 420M documents, and growing quite rapidly. We are currently
> doing
> > a bit more than 1000 document writes/deletes per second.
> >
> > Recently, we've hit some trouble with our production cloud. We have had
> the
> > process on individual instances die a few times, and we see the following
> > error messages being logged (expanded logs at the bottom of the email):
> >
> > ERROR - 2016-04-26 00:56:43.873; org.apache.solr.common.SolrException;
> > null:org.eclipse.jetty.io.EofException
> >
> > WARN  - 2016-04-26 00:55:29.571;
> org.eclipse.jetty.servlet.ServletHandler;
> > /solr/panopto/select
> > java.lang.IllegalStateException: Committed
> >
> > WARN  - 2016-04-26 00:55:29.571; org.eclipse.jetty.server.Response;
> > Committed before 500 {trace=org.eclipse.jetty.io.EofException
> >
> >
> > Another time we saw this happen, we had java OOM errors (expanded logs at
> > the bottom):
> >
> > WARN  - 2016-04-25 22:58:43.943;
> org.eclipse.jetty.servlet.ServletHandler;
> > Error for /solr/panopto/select
> > java.lang.OutOfMemoryError: Java heap space
> > ERROR - 2016-04-25 22:58:43.945; org.apache.solr.common.SolrException;
> > null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap
> space
> > ...
> > Caused by: java.lang.OutOfMemoryError: Java heap space
> >
> >
> > When the cloud goes into recovery during live indexing, it takes about
> 4-6
> > hours for a node to recover, but when we turn off indexing, recovery only
> > takes about 90 minutes.
> >
> > Moreover, we see that deletes are extremely slow. We do batch deletes of
> > about 300 documents based on two value filters, and this takes about one
> > minute:
> >
> > Research online suggests that a larger disk cache could be helpful,
> > but I also see from an older page on tuning for
> > Lucene that turning down the swappiness on our Linux instances may be
> > preferred to simply increasing space for the disk cache.
> >
> > Moreover, to scale in the past, we've simply rolled our cluster while
> > increasing the memory on the new machines, but I wonder if we're hitting
> > the limit for how much we should scale vertically. My impression is that
> > sharding will allow us to warm searchers faster and maintain a more
> > effective cache as we scale. Will we really be helped by sharding, or is
> it
> > only a matter of total CPU/Memory in the cluster?
> >
> > Thanks!
> >
> > Stephen
> >
> > (206)753-9320
> > stephen-lewis.net
> >
> > 

Decide on facets from results

2016-04-27 Thread Mark Robinson
Hi,

If I don't have my facet list at query time, from the results can I select
some fields and by any means create a facet on them? ie after I get the
results I want to identify some fields as facets and send back facets for
them in the response.

A kind of very dynamic faceting based on the results!

Could someone please share their idea.

Thanks!
Anil.
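
For reference, a hedged sketch of the two-query-from-the-client approach
suggested in the replies (the collection and field names are made up):

    # 1) plain query; the client inspects the returned docs and picks facet fields
    http://localhost:8983/solr/mycollection/select?q=shoes&rows=20

    # 2) re-query with facets on the chosen fields
    http://localhost:8983/solr/mycollection/select?q=shoes&rows=0&facet=on&facet.field=brand&facet.field=color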


Re: Degraded performance between Solr 4 and Solr 5

2016-04-27 Thread Jaroslaw Rozanski
OK, so here is an interesting find.

As my setup requires frequent (soft) commits cache brings little value.
I tested following on Solr 5.5.0:

q={!cache=false}*:*&
fq={!cache=false}query1 /* not expensive */&
fq={!cache=false cost=200}query2 /* expensive! */&

Only with the above set-up (and forcing Solr Post Filtering for the expensive
query, hence cost 200) was I able to return to Solr 4.10.3 performance.
(A non-cached fq with a cost of 100 or more is executed as a post filter
when the query type supports it.)

By Solr 4 performance I mean:
- not only Solr 4 response times (roughly) for queries returning values,
but also
- very fast response for queries that have 0 results 

I wonder what could be the underlying cause.

Thanks,
Jarek

On Wed, 27 Apr 2016, at 09:13, Jaroslaw Rozanski wrote:
> Hi Eric,
> 
> Measuring running queries via JMeter. Values provided are rounded median
> of multiple samples. Medians are just slightly better than 99th
> percentile for all samples. 
> 
> The filter caches are useless as you mentioned; they are effectively not used.
> There is auto-warming through cache autoWarm but no auto-warming
> queries. 
> 
> Small experiment with passing =... seems not to make any difference
> which would not be surprising given caches are barely involved.
> 
> Thanks for the suggestion on IO. After stopping indexing, the response
> time barely changed on Solr 5. On Solr 4, with indexing running it is
> still fast. So effectively, Solr 4 under indexing load is faster than
> idle Solr 5. Both set-ups have same heap size and available RAM on
> machine (2x heap).
> 
> One other thing I am testing is issuing request to specific core, with
> distrib=false. No significant improvements there.
> 
> Now what is interesting is that the aforementioned query takes the same
> amount of time to execute regardless of the number of documents found.
> - Whether it is 0 or 10k, it takes a couple of seconds on Solr 5.5.0.
> - Meanwhile, on Solr 4.10.3, the response time is dependent on the result
> size. For Solr 4, no results returns in a few ms, and a couple of
> thousand results take a few seconds.
> (query used q={!cache=false}...)
>   
> 
> Thanks,
> Jarek
> 
> On Wed, 27 Apr 2016, at 04:39, Erick Erickson wrote:
> > Well, the first question is always "how are you measuring this"?
> > Measuring a few queries is almost completely uninformative,
> > especially if the two systems have differing warmups. The only
> > meaningful measurements are when throwing away the first bunch
> > of queries then measuring a meaningful sample.
> > 
> > The setup you describe will be very sensitive to disk access
> > with the autowarm of 1 second, so if there's much at all in
> > the way of differences in I/O that would be a red flag.
> > 
> > From here on down doesn't really respond to the question, but
> > I thought I'd mention it.
> > 
> > And you don't have to worry about disabling your filterCache since
> > any filter query of the form fq=field:[mention NOW in here without
> > rounding]
> > will never be re-used. So you might as well use {!cache=false}. Here's the
> > background:
> > 
> > https://lucidworks.com/blog/2012/02/23/date-math-now-and-filter-queries/
> > 
> > And your soft commit is probably throwing out all the filter caches
> > anyway.
> > 
> > I doubt you're doing any autowarming at all given the autocommit interval
> > of 1 second and continuously updating documents and your reported
> > query times. So you can pretty much forget what I said about throwing
> > out your first N queries since you're (probably) not getting any benefit
> > out of caches anyway.
> > 
> > On Tue, Apr 26, 2016 at 10:34 AM, Jaroslaw Rozanski
> >  wrote:
> > > Hi all,
> > >
> > > I am migrating a large Solr Cloud cluster from Solr 4.10 to Solr 5.5.0
> > > and I observed big difference in query execution time.
> > >
> > > First a setup summary:
> > > - multiple collections - 6
> > > - each has multiple shards - 6
> > > - same/similar hardware
> > > - indexing tens of messages per second
> > > - autoSoftCommit with 1s; hard commit few tens of seconds
> > > - Java 8
> > >
> > > The query has following form: field1:[* TO NOW-14DAYS] OR (-field1:[* TO
> > > *] AND field2:[* TO NOW-14DAYS])
> > >
> > > The fields field1 & field2 are of date type:
> > > <fieldType ... positionIncrementGap="0"/>
> > >
> > > As query (q={!cache=false}...)
> > > Solr 4.10 -> 5s
> > > Solr 5.5.0 -> 12s
> > >
> > As filter query (q={!cache=false}*:*&fq=...)
> > > Solr 4.10 -> 9s
> > > Solr 5.5.0 -> 11s
> > >
> > > The query itself is bad and its optimization aside, I am wondering if
> > > there is anything in Lucene/Solr that would have such an impact on query
> > > execution time between versions.
> > >
> > > Originally I though it might be related to
> > > https://issues.apache.org/jira/browse/SOLR-8251 and testing on small
> > > scale proved that there is a difference in performance. However upgraded
> > > version is already 5.5.0.
> > >
> > >
> > >
> > > Thanks,
> > > Jarek
> > >


RE: solr | backup and restoration

2016-04-27 Thread Prateek Jain J

Manually copying the files under the index directory fixed the issue.
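
For reference, a hedged sketch of that manual restore (the paths and the
snapshot name are made up; Solr should be stopped, or the core unloaded,
while the files are swapped):

    # the backup command writes a snapshot.<timestamp> directory under the location
    cd /opt/solr/example/solr/myCore/data
    rm -rf index/*
    cp -r /tmp/myBackup/snapshot.20160427120000000/* index/
    # then restart Solr / reload the core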


Regards,
Prateek Jain

-Original Message-
From: Prateek Jain J [mailto:prateek.j.j...@ericsson.com] 
Sent: 27 April 2016 02:08 PM
To: solr-user@lucene.apache.org
Subject: solr | backup and restoration


Hi,

We are using solr 4.8.1 in production and want to create backups at runtime. As 
per the reference guide,  we can create backup using something like this:

http://localhost:8983/solr/myCore/replication?command=backup&location=/tmp/myBackup&numberToKeep=1

and we verified that some files are getting created in the /tmp/myBackup directory. 
The issue that we are facing is how to restore everything using this backup.
The Admin guide does talk about "Merging Indexes" using two methods:


a.   indexDir for example,

http://localhost:8983/solr/admin/cores?action=mergeindexes&core=core0&indexDir=/home/solr/core1/data/index&indexDir=/home/solr/core2/data/index



b.  srcCore for example,

http://localhost:8983/solr/admin/cores?action=mergeindexes&core=core0&srcCore=core1&srcCore=core2



These are not working in our case, as we want the entire data to be restorable,
for example if we want to re-create a core from a snapshot. I do see that such
functionality is available in later versions, as described here:

https://cwiki.apache.org/confluence/display/solr/Making+and+Restoring+Backups+of+SolrCores


Regards,
Prateek Jain



solr | backup and restoration

2016-04-27 Thread Prateek Jain J

Hi,

We are using solr 4.8.1 in production and want to create backups at runtime. As 
per the reference guide,  we can create backup using something like this:

http://localhost:8983/solr/myCore/replication?command=backup&location=/tmp/myBackup&numberToKeep=1

and we verified that some files are getting created in the /tmp/myBackup directory. 
The issue that we are facing is how to restore everything using this backup.
The Admin guide does talk about "Merging Indexes" using two methods:


a.   indexDir for example,

http://localhost:8983/solr/admin/cores?action=mergeindexes&core=core0&indexDir=/home/solr/core1/data/index&indexDir=/home/solr/core2/data/index



b.  srcCore for example,

http://localhost:8983/solr/admin/cores?action=mergeindexes&core=core0&srcCore=core1&srcCore=core2



These are not working in our case, as we want the entire data to be restorable,
for example if we want to re-create a core from a snapshot. I do see that such
functionality is available in later versions, as described here:

https://cwiki.apache.org/confluence/display/solr/Making+and+Restoring+Backups+of+SolrCores


Regards,
Prateek Jain



Re: How can i pass in solr dynamic values to a particular field

2016-04-27 Thread Erik Hatcher
A couple of options - param substitution -

fq=maths:${maths_v} AND science:${science_v}

where &maths_v=25 and &science_v=30

And one I like better because it’s a more precisely defined query:

   fq=({!term f=maths v=$maths_v}) AND ({!term f=science v=$science_v})
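
A hedged sketch of that second form as a full request, reusing the
collection name from the original question and passing the values as
separate request parameters (two fq parameters are equivalent to ANDing
the filters, and each stays independently cacheable):

    http://localhost:8983/solr/searching/select?q=*:*&fq={!term f=maths v=$maths_v}&fq={!term f=science v=$science_v}&maths_v=25&science_v=30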


—
Erik Hatcher, Senior Solutions Architect
http://www.lucidworks.com 



> On Apr 27, 2016, at 6:47 AM, kavurupavan  wrote:
> 
> Solr url :
> 
> Example:
> 
> http://localhost:8983/solr/searching/select?q=*:*&fq=maths:25 AND science:30
> 
> I need to pass these values, 25 and 30, dynamically. How can I pass them as
> part of a Solr URL? 
> Please help me. Thanks in advance.
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/How-can-i-pass-in-solr-dynamic-values-to-a-particular-field-tp4273107.html
> Sent from the Solr - User mailing list archive at Nabble.com.



RE: How can i pass in solr dynamic values to a particular field

2016-04-27 Thread rajeshkumar . s
Hi kavurupavan,
From which language do you need to pass dynamic values?
 
 
 
---
Thanks and regards,
Rajesh Kumar Sountarrajan
Software Developer / DBA Developer - IT Team
 
Mobile: 91 - 9600984804
Email - rajeshkuma...@maxval-ip.com
 
 
- Original Message - Subject: How can i pass in solr dynamic 
values to a particular field
From: "kavurupavan" 
Date: 4/27/16 4:17 pm
To: solr-user@lucene.apache.org

Solr url :
 
 Example:
 
 http://localhost:8983/solr/searching/select?q=*:*&fq=maths:25 AND science:30
 
 I need to pass these values, 25 and 30, dynamically. How can I pass them as
 part of a Solr URL? 
 Please help me. Thanks in advance.
 
 
 
 --
 View this message in context: 
http://lucene.472066.n3.nabble.com/How-can-i-pass-in-solr-dynamic-values-to-a-particular-field-tp4273107.html
 Sent from the Solr - User mailing list archive at Nabble.com.


How can i pass in solr dynamic values to a particular field

2016-04-27 Thread kavurupavan
Solr url :

Example:

http://localhost:8983/solr/searching/select?q=*:*&fq=maths:25 AND science:30

I need to pass these values, 25 and 30, dynamically. How can I pass them as
part of a Solr URL? 
Please help me. Thanks in advance.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-can-i-pass-in-solr-dynamic-values-to-a-particular-field-tp4273107.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Degraded performance between Solr 4 and Solr 5

2016-04-27 Thread Jaroslaw Rozanski
Hi Eric,

Measuring running queries via JMeter. Values provided are rounded median
of multiple samples. Medians are just slightly better than 99th
percentile for all samples. 

The filter caches are useless as you mentioned; they are effectively not used.
There is auto-warming through cache autoWarm but no auto-warming
queries. 

Small experiment with passing =... seems not to make any difference
which would not be surprising given caches are barely involved.

Thanks for the suggestion on IO. After stopping indexing, the response
time barely changed on Solr 5. On Solr 4, with indexing running it is
still fast. So effectively, Solr 4 under indexing load is faster than
idle Solr 5. Both set-ups have same heap size and available RAM on
machine (2x heap).

One other thing I am testing is issuing request to specific core, with
distrib=false. No significant improvements there.

Now what is interesting is that the aforementioned query takes the same
amount of time to execute regardless of the number of documents found.
- Whether it is 0 or 10k, it takes a couple of seconds on Solr 5.5.0.
- Meanwhile, on Solr 4.10.3, the response time is dependent on the result
size. For Solr 4, no results returns in a few ms, and a couple of
thousand results take a few seconds.
(query used q={!cache=false}...)
  

Thanks,
Jarek

On Wed, 27 Apr 2016, at 04:39, Erick Erickson wrote:
> Well, the first question is always "how are you measuring this"?
> Measuring a few queries is almost completely uninformative,
> especially if the two systems have differing warmups. The only
> meaningful measurements are when throwing away the first bunch
> of queries then measuring a meaningful sample.
> 
> The setup you describe will be very sensitive to disk access
> with the autowarm of 1 second, so if there's much at all in
> the way of differences in I/O that would be a red flag.
> 
> From here on down doesn't really respond to the question, but
> I thought I'd mention it.
> 
> And you don't have to worry about disabling your filterCache since
> any filter query of the form fq=field:[mention NOW in here without
> rounding]
> will never be re-used. So you might as well use {!cache=false}. Here's the
> background:
> 
> https://lucidworks.com/blog/2012/02/23/date-math-now-and-filter-queries/
> 
> And your soft commit is probably throwing out all the filter caches
> anyway.
> 
> I doubt you're doing any autowarming at all given the autocommit interval
> of 1 second and continuously updating documents and your reported
> query times. So you can pretty much forget what I said about throwing
> out your first N queries since you're (probably) not getting any benefit
> out of caches anyway.
> 
> On Tue, Apr 26, 2016 at 10:34 AM, Jaroslaw Rozanski
>  wrote:
> > Hi all,
> >
> > I am migrating a large Solr Cloud cluster from Solr 4.10 to Solr 5.5.0
> > and I observed big difference in query execution time.
> >
> > First a setup summary:
> > - multiple collections - 6
> > - each has multiple shards - 6
> > - same/similar hardware
> > - indexing tens of messages per second
> > - autoSoftCommit with 1s; hard commit few tens of seconds
> > - Java 8
> >
> > The query has following form: field1:[* TO NOW-14DAYS] OR (-field1:[* TO
> > *] AND field2:[* TO NOW-14DAYS])
> >
> > The fields field1 & field2 are of date type:
> > <fieldType ... positionIncrementGap="0"/>
> >
> > As query (q={!cache=false}...)
> > Solr 4.10 -> 5s
> > Solr 5.5.0 -> 12s
> >
> > As filter query (q={!cache=false}*:*&fq=...)
> > Solr 4.10 -> 9s
> > Solr 5.5.0 -> 11s
> >
> > The query itself is bad and its optimization aside, I am wondering if
> > there is anything in Lucene/Solr that would have such an impact on query
> > execution time between versions.
> >
> > Originally I though it might be related to
> > https://issues.apache.org/jira/browse/SOLR-8251 and testing on small
> > scale proved that there is a difference in performance. However upgraded
> > version is already 5.5.0.
> >
> >
> >
> > Thanks,
> > Jarek
> >


Re: ANN: Solr puzzle: Magic Date

2016-04-27 Thread Reth RM
Yes, these can be practice/interview questions. But, considering the
specific example above, it seems the question is about spotting a
syntax error(?); it is not expected that a developer/Solr user knows the
right syntax or commands by heart. What could be interesting is questions
related to cloud concepts, ranking concepts (tf-idf, bm25) or simple problem
statements that ask how something can be implemented using Solr's OOTB
features/apis, and so on. If such are the upcoming puzzle questions, I'm
sure they will be useful.
I liked the idea.


On Tue, Apr 26, 2016 at 5:49 PM, Alexandre Rafalovitch 
wrote:

> I am doing an experiment in teaching about Solr. I've created a Solr
> puzzle and want to know whether people would find it useful to do more
> of these. My mailing list has seen this already, but I would love the
> feedback from a wider Solr audience as well. Privately or on the list.
>
> The - first - puzzle is deceptively simple:
>
> --
> Given the following sequence of commands (for Solr 5.5 or 6.0):
>
> 1. bin/solr create_core -c puzzle_date
> 2. bin/post -c puzzle_date -type text/csv -d $'today\n2016-04-08'
> 3. curl http://localhost:8983/solr/puzzle_date/select?q=Fri
>
> --
> Would the result be:
>
> 1.Error in the command 1 for not providing a configuration directory
> 2.Error in the command 2 for missing a uniqueKey field
> 3.Error in the command 2 due to an incorrect date format
> 4.No records in the command 3 output
> 5.One record in the command 3 output
> --
>
> You can find the answer and full in-depth explanation at:
> http://blog.outerthoughts.com/2016/04/solr-5-puzzle-magic-date-answer/
>
> Again, what I am trying to understand is whether that's somehow useful
> to people and worth making time to create and write-up.
>
> Any feedback would be appreciated.
>
> Regards,
> Alex.
>
> 
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>


Re: Build Java Package for required schema and solrconfig files field and configuration.

2016-04-27 Thread Reth RM
Hi Nitin,

If I understand correctly, you have configured the suggest component in a Solr
instance. A Solr instance is an independent Java program and it runs
on its own when you start and stop it. You cannot package the Solr suggest
component in your Java application/project.

You can use the SolrJ APIs in your Java project to query Solr and obtain
suggestions: http://www.solrtutorial.com/solrj-tutorial.html
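
For reference, a minimal SolrJ sketch along those lines (the core name,
handler path and suggest parameter are assumptions based on this thread):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class SuggestClient {
        public static void main(String[] args) throws Exception {
            // point at the core that has the /suggest handler configured
            HttpSolrClient client = new HttpSolrClient("http://localhost:8983/solr/mycore");
            SolrQuery query = new SolrQuery();
            query.setRequestHandler("/suggest");  // use the suggest handler instead of /select
            query.set("suggest.q", "sol");        // prefix to fetch suggestions for
            QueryResponse response = client.query(query);
            System.out.println(response.getResponse());
            client.close();
        }
    }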




On Wed, Apr 27, 2016 at 10:50 AM, Nitin Solanki 
wrote:

> Hello Everyone,
>  I have created an autosuggest using the Solr suggester.
> I have added a field and field type in schema.xml and made some changes to
> the /suggest request handler in solrconfig.xml.
> Now, I need to build a Java package using that configuration, which I need
> to plug into my current Java project. I don't want to use curl; I need my
> configuration as a jar or Java package. How can I do this? I don't have
> much experience with jar packaging. Any help please...
>
> Thanks,
> Nitin
>