Re: Terms not being indexed; not sure why

2017-05-18 Thread Erick Erickson
The Solr Admin UI has the "LukeRequestHandler" behind it, not Luke. The
stand-alone Luke program is a different beast, although the
LukeRequestHandler is modeled after _some_ features of Luke.

The naming lends itself to some confusion for sure.
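For reference, the handler behind the Admin UI can also be queried directly from
SolrJ, which is sometimes enough instead of reaching for the stand-alone tool. A
minimal sketch, assuming an already-built SolrClient named client that points at
the core in question:

    LukeRequest luke = new LukeRequest();              // goes to the core's /admin/luke (LukeRequestHandler)
    LukeResponse rsp = luke.process(client);
    System.out.println(rsp.getFieldInfo().keySet());   // field details as the LukeRequestHandler reports them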

Best,
Erick

On Thu, May 18, 2017 at 7:11 PM, Rick Leir  wrote:
> Erick,
>
> In the Solr Admin UI, click on tabs and watch the log. The Analysis tab
> seems to have Luke behind it, and one other. But the screen layout seems
> different from the stand-alone Luke. I plan to give Lukestandalone a try
> soon. cheers -- Rick
>
>
> On 2017-05-16 10:44 AM, Erick Erickson wrote:
>>
>> if
>> you really want to examine the index, get a copy of Luke, although I'm
>> not sure how up to date it is.
>
>


Re: Terms not being indexed; not sure why

2017-05-18 Thread Rick Leir

Erick,

In the Solr Admin UI, click on tabs and watch the log. The Analysis tab 
seems to have Luke behind it, and one other. But the screen layout seems 
different from the stand-alone Luke. I plan to give Lukestandalone a try 
soon. cheers -- Rick



On 2017-05-16 10:44 AM, Erick Erickson wrote:

if
you really want to examine the index, get a copy of Luke, although I'm
not sure how up to date it is.




Re: Slow Bulk InPlace DocValues updates

2017-05-18 Thread Damien Kamerman
Adding more shards will scale your writes.

On 18 May 2017 at 20:08, Dan .  wrote:

> Hi,
>
> -Solr 6.5.1
> -SSD disk
> -23M docs index 64G single shard
>
> I'm trying to do around 4M in-place docValue updates to a collection
> (single shard or around 23M docs) [these are ALL in-place updates]
>
>  I can add the updates in around 7mins, but flushing to disk takes around
> 40mins! I've been able to add the updates quickly by adding:
>
> 
> 4000
>   
>
> autoSoftCommit/autoCommit currently disabled.
>
> From the thread dump I see that the flush is in a single thread and
> extremely slow. Dump below, the culprit seems to be [
>
>    - org.apache.lucene.index.BufferedUpdatesStream.applyDocValuesUpdates(BufferedUpdatesStream.java:666)]
>
> :
>
>    - org.apache.lucene.codecs.blocktree.SegmentTermsEnum.pushFrame(SegmentTermsEnum.java:256)
>    - org.apache.lucene.codecs.blocktree.SegmentTermsEnum.pushFrame(SegmentTermsEnum.java:248)
>    - org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:538)
>
>    - org.apache.lucene.index.BufferedUpdatesStream.applyDocValuesUpdates(BufferedUpdatesStream.java:666)
>    - org.apache.lucene.index.BufferedUpdatesStream.applyDocValuesUpdatesList(BufferedUpdatesStream.java:612)
>    - org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:269)
>    - org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3454)
>    - org.apache.lucene.index.IndexWriter.applyDeletesAndPurge(IndexWriter.java:4990)
>    - org.apache.lucene.index.DocumentsWriter$ApplyDeletesEvent.process(DocumentsWriter.java:717)
>    - org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:5040)
>    - org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:5031)
>    - org.apache.lucene.index.IndexWriter.updateDocValues(IndexWriter.java:1731)
>    - org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:911)
>    - org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:302)
>    - org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:239)
>    - org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:194)
>
>
> I think this is related to
> SOLR-6838 [https://issues.apache.org/jira/browse/SOLR-6838]
> and
> LUCENE-6161 [https://issues.apache.org/jira/browse/LUCENE-6161]
>
> I need to make the flush faster, to complete the update quicker. Has anyone
> a workaround or have any suggestions?
>
> Many thanks,
> Dan
>


Data Dir from a core with init-failure

2017-05-18 Thread Shashank Pedamallu
Hi all,

Question:
I would like to know the reliable way to fetch the dataDir (and 
instanceDir) of a core that has an init failure.

Trials made:
I used the admin api to get status:
http://localhost:8983/solr/admin/cores?action=STATUS==json=true
But I observed that it does not always give me the information.

Steps followed:
Step 1: On a healthy core, remove some index file and restart Solr
Step 2: Invoke status api. Response was:
{
  "responseHeader":{
"status":0,
"QTime":5},
  
"initFailures":{"":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
 Error opening new searcher"},
  "status":{
"":{
  "name":"",
  "instanceDir":"",
  "dataDir":"",
  "config":"solrconfig.xml",
  "schema":"schema.xml",
  "isLoaded":"false"}}}
Step 3: Replace the removed file and restart Solr. Everything works as it is.
Step 4: Unload the core (without deleting indexDir or dataDir). api: 
http://localhost:8983/solr/admin/cores?action=UNLOAD=
Step 5: Create the core (with same indexDir and dataDir). api: 
http://localhost:8983/solr/admin/cores?action=CREATE==path/to/dir=config_file_name.xml=data
Step 6: Repeat Step 1. I.e., remove some index file and restart Solr
Step 7: Repeat Step 2. I.e, Invoke status api. Response was:
{
  "responseHeader":{
"status":0,
"QTime":0},
  
"initFailures":{"":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
 Error opening new searcher"},
  "status":{
"":{}}}

As you can see, the response of the status API from Step 7 is different from Step 2. 
Is there a reliable way to get the dataDir of a core in the presence of an 
init failure?
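
For comparison, the same status call through SolrJ's CoreAdmin API -- a minimal
sketch, assuming an already-built SolrClient named client pointing at the Solr base
URL, with a placeholder core name. As Step 7 shows, the per-core status block can
come back empty after an init failure, so initFailures has to be consulted as well:

    CoreAdminResponse status = CoreAdminRequest.getStatus("mycore", client);
    NamedList<Object> coreStatus = status.getCoreStatus("mycore");
    // can be null or empty when the core failed to initialize
    Object dataDir = (coreStatus == null) ? null : coreStatus.get("dataDir");
    System.out.println("dataDir = " + dataDir);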

Thanks,
Shashank Pedamallu


Upgrading Similarity from Solr 4.x to 6.x

2017-05-18 Thread Lynn Monson
I have a question about moving a trivial (?) Similarity class from version
4.x to 6.x

I have queries along these lines: field1:somevalue^0.55
field2:anothervalue^1.4
The score for a document is simply the sum of weights. A hit on field1
alone scores 0.55, field2 alone scores 1.4, and both fields score 1.95.

The codebase I am dealing with is based on 4.x, and has a similarity class
like this:
--
public class MySimilarity extends Similarity {

   public long computeNorm(FieldInvertState state) {
      return 1;
   }

   public SimWeight computeWeight(float queryBoost, CollectionStatistics collectionStats, TermStatistics... termStats) {
      return new MySimWeight(queryBoost);
   }

   public SimScorer simScorer(SimWeight weight, AtomicReaderContext context) throws IOException {
      return new MySimScorer((MySimWeight) weight);
   }

   public static class MySimScorer extends Similarity.SimScorer {
      public MySimWeight weight;

      ...

      public float score(int doc, float freq) {
         return freq > 0.0 ? weight.boost : 0.0f; // don't count >1 occurrence
      }

      public float computeSlopFactor(int distance) {
         return 1.0f / (distance + 1);
      }

      public float computePayloadFactor(int doc, int start, int end, BytesRef payload) {
         return 1.0f;
      }
   }

   public static class MySimWeight extends SimWeight {
      public float topLevelBoost;
      public float queryBoost;
      public float boost;

      public MySimWeight(float queryBoost) {
         this.queryBoost = queryBoost;
      }

      public float getValueForNormalization() {
         return 1.0f;
      }

      public void normalize(float queryNorm, float topLevelBoost) {
         this.topLevelBoost = topLevelBoost;
         this.boost = this.queryBoost * topLevelBoost;
      }
   }
}
--

How would I adapt this for Solr 6.x where the boosting factor no longer
appears in the Similarity methods?
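
For whatever it's worth, here is a rough sketch (not a verified answer) of how the
class above might map onto the 6.x API, assuming the combined query-time boost is
now handed to SimWeight.normalize(queryNorm, boost) rather than to computeWeight(),
and that AtomicReaderContext has become LeafReaderContext; the exact override set
should be double-checked against the 6.x javadocs:

import java.io.IOException;
import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.CollectionStatistics;
import org.apache.lucene.search.TermStatistics;
import org.apache.lucene.search.similarities.Similarity;
import org.apache.lucene.util.BytesRef;

public class MySimilarity extends Similarity {

   @Override
   public long computeNorm(FieldInvertState state) {
      return 1;
   }

   @Override
   public SimWeight computeWeight(CollectionStatistics collectionStats, TermStatistics... termStats) {
      return new MySimWeight();                        // no queryBoost parameter any more
   }

   @Override
   public SimScorer simScorer(SimWeight weight, LeafReaderContext context) throws IOException {
      return new MySimScorer((MySimWeight) weight);
   }

   public static class MySimWeight extends SimWeight {
      public float boost = 1.0f;

      @Override
      public float getValueForNormalization() {
         return 1.0f;
      }

      @Override
      public void normalize(float queryNorm, float boost) {
         this.boost = boost;                           // the per-clause boost (e.g. 0.55 or 1.4) arrives here
      }
   }

   public static class MySimScorer extends SimScorer {
      private final MySimWeight weight;

      public MySimScorer(MySimWeight weight) {
         this.weight = weight;
      }

      @Override
      public float score(int doc, float freq) {
         return freq > 0.0f ? weight.boost : 0.0f;     // count at most one occurrence
      }

      @Override
      public float computeSlopFactor(int distance) {
         return 1.0f / (distance + 1);
      }

      @Override
      public float computePayloadFactor(int doc, int start, int end, BytesRef payload) {
         return 1.0f;
      }
   }
}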


Custom RequestHandler with the Solr api (solrj) that makes a query call back to the server

2017-05-18 Thread Jack Java
Hi,
I'm looking for some advice on a specific issue that is holding us back.

I'm trying to create a custom RequestHandler with the Solr api (solrj) that
makes a query call back to the server.

I'm not finding any good, runnable examples online. Possibly I'm
approaching this wrong. Any advice would be appreciated.

All I'm trying to do is query the data to see what are all of the values
used by some specific fields and then construct an XML response to send back
to the user. To get the information that I need, I have to do more than one
query and process some of the data returned. That is why I am using a
RequestHandler.

I know how to write these queries to get the information that is needed.
They work fine individually, but the issue is I'm having trouble using the
objects available within the RequestHandler to make the call back to the
server. The response is always 0 documents no matter how simple/broad the
query is (even q=*:*).

Here is one of my attempts at coding this.

 public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
     throws Exception {
   SolrCore core = req.getCore();
   SolrRequestHandler handler = core.getRequestHandler("/select");
   ModifiableSolrParams params = new ModifiableSolrParams();
   params.add("q", "*:*");
   // new local variable names: re-declaring req/rsp would shadow the handler's own parameters
   SolrQueryRequest localReq = new LocalSolrQueryRequest(core, params);
   SolrQueryResponse localRsp = new SolrQueryResponse();
   core.execute(handler, localReq, localRsp);
   // !!! Not returning a structured response
   System.out.println(localRsp.toString());
   System.out.println(localRsp.getReturnFields());
   System.out.println(localRsp.getValues().toString());
 }



Re: Custom RequestHandler with the Solr api (solrj) that makes a query call back to the server

2017-05-18 Thread Chris Yee
Reformatting for readability:

Hi,
I'm looking for some advice on specific issue that is holding us back.

I'm trying to create a custom RequestHandler with the Solr api (solrj) that
makes a query call back to the server.

I'm not finding any good, run-able examples of this on-line. Possibly I'm
approaching this wrong. Any advice would be appreciated.

All I'm trying to do is query the data to see what are all of the values
used by some specific fields and then construct a XML response to send back
to the user. To get the information that I need, I have to do more than one
query and process some of the data returned. That is why I am using a
RequestHandler.

I know how to write these queries to get this information that is need.
They work fine individually but the issue is I'm having trouble using the
objects available within the RequestHandler to make the call back to the
server. The response is always 0 documents no matter how simple/broad your
query is. "q", "*:*"

Here is one of my attempts to coding this.

 public void handleRequestBody(SolrQueryRequest req,SolrQueryResponse rsp)
throws Exception {
  SolrCore solrServerCore= req.getCore();
  SolrRequestHandler handler =solrServerCore.getRequestHandler("/select");
  ModifiableSolrParams params =new ModifiableSolrParams();
  params1.add("q","*:*");
  SolrQueryRequest req= new LocalSolrQueryRequest(solrServerCore,params);
  SolrQueryResponse rsp= new SolrQueryResponse();;
  solrServerCore.execute(handler1,req, rsp);
  //!!!Not returning a structured response
  System.out.println(rsp.toString());
  System.out.println(rsp.getReturnFields());
  System.out.println(rsp.getValues().toString());

On Thu, May 18, 2017 at 2:37 PM, Jack Java 
wrote:

>
> Hi,I'm looking for some advice on specific issue that is holding usback.
> I'mtrying to create a custom RequestHandler with the Solr api (solrj)that
> makes a query call back to the server.
> I'mnot finding any good, run-able examples of this on-line. Possibly
> I'mapproaching this wrong. Any advice would be appreciated.
> AllI'm trying to do is query the data to see what are all of the
> valuesused by some specific fields and then construct a XML response tosend
> back to the user. To get the information that I need, I have todo more than
> one query and process some of the data returned. That iswhy I am using a
> RequestHandler.
> Iknow how to write these queries to get this information that is need.They
> work fine individually but the issue is I'm having trouble usingthe objects
> available within the RequestHandler to make the call backto the server. The
> response is always 0 documents no matter howsimple/broad your query is.
> "q", "*:*"Hereis one of my attempts to coding this.
>
>  publicvoidhandleRequestBody(SolrQueryRequest req,SolrQueryResponse rsp)
> throwsException {
>  SolrCoresolrServerCore= req.getCore(); SolrRequestHandlerhandler
> =solrServerCore.getRequestHandler("/select");  ModifiableSolrParamsparams
> =newModifiableSolrParams(); params1.add("q","*:*");  SolrQueryRequest req=
> newLocalSolrQueryRequest(solrServerCore,params); SolrQueryResponse rsp=
> newSolrQueryResponse();; solrServerCore.execute(handler1,req, rsp);
> //!!!Not returning a structured response System.out.println(rsp.toString());
> System.out.println(rsp.getReturnFields()); System.out.println(rsp.
> getValues().toString());
>
>
>


Custom RequestHandler with the Solr api (solrj) that makes a query call back to the server

2017-05-18 Thread Jack Java

Hi, I'm looking for some advice on specific issue that is holding us back.
I'm trying to create a custom RequestHandler with the Solr api (solrj) that makes
a query call back to the server.
I'm not finding any good, run-able examples of this on-line. Possibly
I'm approaching this wrong. Any advice would be appreciated.
All I'm trying to do is query the data to see what are all of the values used by
some specific fields and then construct a XML response to send back to the user.
To get the information that I need, I have to do more than one query and process
some of the data returned. That is why I am using a RequestHandler.
I know how to write these queries to get this information that is need. They work
fine individually but the issue is I'm having trouble using the objects
available within the RequestHandler to make the call back to the server. The
response is always 0 documents no matter how simple/broad your query is. "q",
"*:*"
Here is one of my attempts to coding this.

 public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
 throws Exception {
  SolrCore solrServerCore = req.getCore();
  SolrRequestHandler handler = solrServerCore.getRequestHandler("/select");
  ModifiableSolrParams params = new ModifiableSolrParams();
  params1.add("q","*:*");
  SolrQueryRequest req = new LocalSolrQueryRequest(solrServerCore, params);
  SolrQueryResponse rsp = new SolrQueryResponse();;
  solrServerCore.execute(handler1, req, rsp);
  //!!!Not returning a structured response
  System.out.println(rsp.toString());
  System.out.println(rsp.getReturnFields());
  System.out.println(rsp.getValues().toString());




Re: knowing which fields were successfully hit

2017-05-18 Thread Dan .
Hi,

What about a function query in the field list.

e.g.

for:
field.x
field.y

http://...?
q={!type=dismax qf='field.x field.y' v=$qq}
&qq=solr rocks
&fl=id,score,x_score:query({!type=dismax qf='field.x' v=$qq}),y_score:query({!type=dismax qf='field.y' v=$qq})

Hit is x_score or y_score > 0
Note that you might get a score for both depending on data.
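
If this ends up being done from SolrJ, the same trick might look roughly like this
(field names as in the example above; the collection name, query text, and the
already-built SolrClient named client are placeholders):

   SolrQuery q = new SolrQuery("{!type=dismax qf='field.x field.y' v=$qq}");
   q.set("qq", "solr rocks");
   q.set("fl", "id,score,"
       + "x_score:query({!type=dismax qf='field.x' v=$qq}),"
       + "y_score:query({!type=dismax qf='field.y' v=$qq})");
   QueryResponse rsp = client.query("mycollection", q);
   for (SolrDocument d : rsp.getResults()) {
     // a pseudo-field value > 0 means that field contributed a match for this document
     Number xScore = (Number) d.getFirstValue("x_score");
     Number yScore = (Number) d.getFirstValue("y_score");
   }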

Dan

On 17 May 2017 at 12:06, John Blythe  wrote:

> hey erik, totally unaware of those two. we're able to retrieve metadata
> about the query itself that way?
>
> --
> *John Blythe*
> Product Manager & Lead Developer
>
> 251.605.3071 | j...@curvolabs.com
> www.curvolabs.com
>
> 58 Adams Ave
> Evansville, IN 47713
>
> On Tue, May 16, 2017 at 1:54 PM, Erik Hatcher 
> wrote:
>
> > Is this the equivalent of facet.query’s?   or maybe rather, group.query?
> >
> > Erik
> >
> >
> >
> > > On May 16, 2017, at 1:16 PM, Dorian Hoxha 
> > wrote:
> > >
> > > Something like elasticsearch named-queries, right
> > > https://www.elastic.co/guide/en/elasticsearch/reference/
> > current/search-request-named-queries-and-filters.html
> > > ?
> > >
> > >
> > > On Tue, May 16, 2017 at 7:10 PM, John Blythe 
> wrote:
> > >
> > >> sorry for the confusion. as in i received results due to matches on
> > field x
> > >> vs. field y.
> > >>
> > >> i've gone w a highlighting solution for now. the fact that it requires
> > >> field storage isn't yet prohibitive for me, so can serve well for now.
> > open
> > >> to any alternative approaches all the same
> > >>
> > >> thanks-
> > >>
> > >> --
> > >> *John Blythe*
> > >> Product Manager & Lead Developer
> > >>
> > >> 251.605.3071 | j...@curvolabs.com
> > >> www.curvolabs.com
> > >>
> > >> 58 Adams Ave
> > >> Evansville, IN 47713
> > >>
> > >> On Tue, May 16, 2017 at 11:37 AM, David Hastings <
> > >> hastings.recurs...@gmail.com> wrote:
> > >>
> > >>> what do you mean "hit?" As in the user clicked it?
> > >>>
> > >>> On Tue, May 16, 2017 at 11:35 AM, John Blythe 
> > >> wrote:
> > >>>
> >  hey all. i'm sending data out that could represent a purchased item
> or
> > >> a
> >  competitive alternative. when the results are returned i'm needing
> to
> > >>> know
> >  which of the two were hit so i can serve up the *other*.
> > 
> >  i can make a blunt instrument in the application layer to simply
> look
> > >>> for a
> >  match between the queried terms and the resulting fields, but the
> > >> problem
> >  of fuzzy matching and some of the special analysis being done to get
> > >> the
> >  hits will be for naught.
> > 
> >  cursory googling landed me at a similar discussion that suggested
> > using
> > >>> hit
> >  highlighting or retrieving the debuggers explain data to sort
> through.
> > 
> >  is there another, more efficient means or are these the two tools in
> > >> the
> >  toolbox?
> > 
> >  thanks!
> > 
> > >>>
> > >>
> >
> >
>


Re: cursorMark value causes Request-URI Too Long excpetion

2017-05-18 Thread gigo314
Thanks a lot. I didn't think of switching to
application/x-www-form-urlencoded content type. It solved my issue :)



--
View this message in context: 
http://lucene.472066.n3.nabble.com/cursorMark-value-causes-Request-URI-Too-Long-excpetion-tp4335472p4335691.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Atomic Document update Conditional

2017-05-18 Thread Dan .
Hi,

Why not write a custom UpdateRequestProcessor? If it's a special case
from the norm, then place it in its own chain and do the update like
http://.../update/json?update.chain=<chain-name>
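
A hedged skeleton of that idea (class, field, and chain names here are made up, and
the lookup of the currently stored timestamp -- e.g. via a real-time get -- is only
sketched in a comment):

import java.io.IOException;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;
import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

public class TimestampGuardProcessorFactory extends UpdateRequestProcessorFactory {

  @Override
  public UpdateRequestProcessor getInstance(SolrQueryRequest req, SolrQueryResponse rsp,
                                            UpdateRequestProcessor next) {
    return new UpdateRequestProcessor(next) {
      @Override
      public void processAdd(AddUpdateCommand cmd) throws IOException {
        SolrInputDocument doc = cmd.getSolrInputDocument();
        // Sketch only: look up the stored field1_timestamp for this doc's id (for example
        // with a real-time get against the core) and compare it with the incoming value.
        // If the stored one is newer, drop just that part of the update:
        //   doc.removeField("field1");
        //   doc.removeField("field1_timestamp");
        // or skip the whole document by returning without calling super.processAdd(cmd).
        super.processAdd(cmd);
      }
    };
  }
}

The factory would then be registered in its own updateRequestProcessorChain in
solrconfig.xml and selected per request with update.chain, as described above.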

Cheers,
Dan

On 18 May 2017 at 09:05, Aman Deep Singh  wrote:

> Hi ,
> Is their any way to do the SOLR atomic update based on some condition
> Suppose in my SOLR schema i have some fields
>
>1. field1
>2. field2
>3. field1_timestamp
>4. field2_timestamp
>
> Now i have to update value of field1 only if field1_timestamp is less then
> the provided timestamp
> I found a SOLR thread for same question
> http://lucene.472066.n3.nabble.com/Conditional-atomic-
> update-td4277224.html
> but it doesn't contains any solution
>
>
> Thanks,
> Aman Deep Singh
>


Re: Solr Atomic Document update Conditional

2017-05-18 Thread Aman Deep Singh
Hi Shawn,
Solr optimistic concurrency works fine for only one field.
But in my case two or more fields can be updated at the same time,
and one field cannot be updated if its corresponding timestamp is greater
than the request time.


On 18-May-2017 6:15 PM, "Shawn Heisey"  wrote:

On 5/18/2017 2:05 AM, Aman Deep Singh wrote:
> Is their any way to do the SOLR atomic update based on some condition
> Suppose in my SOLR schema i have some fields
>
>1. field1
>2. field2
>3. field1_timestamp
>4. field2_timestamp
>
> Now i have to update value of field1 only if field1_timestamp is less then
> the provided timestamp
> I found a SOLR thread for same question
> http://lucene.472066.n3.nabble.com/Conditional-atomic-
update-td4277224.html
> but it doesn't contains any solution

I have never heard of anything like this.

If you want to write your own custom update processor that looks for the
condition and removes the appropriate atomic update command(s) from the
request, or even completely aborts the request, you can certainly do
so.  That seems to be what was suggested by the author of the thread
that you referenced.

What you have described sounds a little bit like optimistic concurrency,
something Solr supports out of the box.  The feature isn't identical to
what you described, but maybe you can adapt it.  Automatically assigned
_version_ values are derived from a java timestamp.

http://yonik.com/solr/optimistic-concurrency/

Thanks,
Shawn


solr 5.5.2 bug in edismax pf2 when boosting term

2017-05-18 Thread elisabeth benoit
Hello,

I am using solr 5.5.2.

I am trying to give a lower score to frequent words in query.

The only way I've found so far is to do like

q=avenue^0.1 de champaubert village suisse 75015 paris

where avenue is a frequent word.

The problem is I'm using edismax, and when I add ^0.1 to avenue, it is not
considered anymore in pf2.

I am looking for a workaround, or another way to give a lower score to
frequent words in Solr.

If anyone could help it would be great.

Elisabeth


Re: Solr Atomic Document update Conditional

2017-05-18 Thread Shawn Heisey
On 5/18/2017 2:05 AM, Aman Deep Singh wrote:
> Is their any way to do the SOLR atomic update based on some condition
> Suppose in my SOLR schema i have some fields
>
>1. field1
>2. field2
>3. field1_timestamp
>4. field2_timestamp
>
> Now i have to update value of field1 only if field1_timestamp is less then
> the provided timestamp
> I found a SOLR thread for same question
> http://lucene.472066.n3.nabble.com/Conditional-atomic-update-td4277224.html
> but it doesn't contains any solution

I have never heard of anything like this.

If you want to write your own custom update processor that looks for the
condition and removes the appropriate atomic update command(s) from the
request, or even completely aborts the request, you can certainly do
so.  That seems to be what was suggested by the author of the thread
that you referenced.

What you have described sounds a little bit like optimistic concurrency,
something Solr supports out of the box.  The feature isn't identical to
what you described, but maybe you can adapt it.  Automatically assigned
_version_ values are derived from a java timestamp.

http://yonik.com/solr/optimistic-concurrency/
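
For reference, a minimal SolrJ sketch of that feature (collection, id, and field
names are only illustrative, and an already-built SolrClient named client is
assumed): send back the _version_ you read, and Solr rejects the update with a
version-conflict error (HTTP 409) if the document changed in the meantime.

   SolrDocument current = client.getById("mycollection", "doc-123");
   long version = (Long) current.getFieldValue("_version_");

   SolrInputDocument update = new SolrInputDocument();
   update.addField("id", "doc-123");
   update.addField("_version_", version);   // update only succeeds if the doc is still at this version
   update.addField("field1", java.util.Collections.singletonMap("set", "new value"));
   client.add("mycollection", update);      // throws on a version conflict; re-read and retry if needed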

Thanks,
Shawn



Re: cursorMark value causes Request-URI Too Long excpetion

2017-05-18 Thread Shawn Heisey
On 5/18/2017 1:52 AM, gigo314 wrote:
> Thanks, that was my assumption as well that all parameters should are
> supported by both GET and POST. However, when using JSON API I keep getting
> 400 error code:
>
> /Request/:
> {"query":"*","cursorMark":"*","sort":"id asc"}
>
> /Response/:
> {"responseHeader":{"status":400,"QTime":0,"params":{"fl":"id","json":"{\"query\":\"*\",\"cursorMark\":\"*\",\"sort\":\"id
> asc\"}","rows":"1","wt":"json"}},"error":{"metadata":["error-class","org.apache.solr.common.SolrException","root-error-class","org.apache.solr.common.SolrException"],"msg":"*Unknown
> top-level key in JSON request : cursorMark*","code":400}}
You still haven't told us how you are sending requests to Solr, whether
it's being constructed manually and sent with an HTTP module in a
programming language, with curl, or whether you are using a Solr
library, and if so, what language it's for.

I have no idea how to use the JSON API for queries.  I also have no idea
what parameters it supports.  Based on the documentation page, it
doesn't use standard parameters -- the example has "query" and "filter"
where the standard URL parameters for these are "q" and "fq" ... so it
is entirely possible that it cannot support arbitrary parameters like
cursorMark.  The error message says that cursorMark is an unknown top
level JSON key, which supports this idea.

Speaking generally, and not language specific:

You would just put the parameters in the request body, like they would
appear on the URL.  I think to use this, the Content-Type header would
be "|application/x-www-form-urlencoded|".

I *think* that each parameter (not counting the & characters between
them) should be run through a URL encoding routine before being added to
the body.  It's possible that whatever library you're using would do
this automatically, but unless you know for sure, don't assume that.

This might be the sort of thing you need to send in the body:

q%3d*%3a*&fl%3did&cursorMark%3dAoEpVkRCREIxQTE2

In that URL encoded string, an equal sign is %3d and a colon is %3a, so
the decoded string that the servlet container running Solr would see is
this:

q=*:*&fl=id&cursorMark=AoEpVkRCREIxQTE2

When you put URL parameters into a browser, the browser automatically
does the URL encoding for you, and doesn't ever show you the encoded
version.

Thanks,
Shawn



Slow Bulk InPlace DocValues updates

2017-05-18 Thread Dan .
Hi,

-Solr 6.5.1
-SSD disk
-23M docs index 64G single shard

I'm trying to do around 4M in-place docValues updates to a collection
(single shard of around 23M docs) [these are ALL in-place updates]

 I can add the updates in around 7mins, but flushing to disk takes around
40mins! I've been able to add the updates quickly by adding:


4000
  

autoSoftCommit/autoCommit currently disabled.
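
For context, this is roughly what one such update looks like from SolrJ (the field
and id are made up, and an already-built SolrClient named client pointing at the
collection is assumed). For the update to be applied in place, the target field has
to be single-valued, non-indexed, non-stored and docValues="true":

   SolrInputDocument doc = new SolrInputDocument();
   doc.addField("id", "doc-123");                                                // uniqueKey of the doc to patch
   doc.addField("popularity", java.util.Collections.singletonMap("set", 42L));   // atomic "set" on the docValues field
   client.add(doc);
   // commit once at the end (or rely on autoCommit); committing per update would be far slower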

From the thread dump I see that the flush is in a single thread and
extremely slow. Dump below, the culprit seems to be [

   - org.apache.lucene.index.BufferedUpdatesStream.applyDocValuesUpdates(BufferedUpdatesStream.java:666)]

:

   - org.apache.lucene.codecs.blocktree.SegmentTermsEnum.pushFrame(SegmentTermsEnum.java:256)
   - org.apache.lucene.codecs.blocktree.SegmentTermsEnum.pushFrame(SegmentTermsEnum.java:248)
   - org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:538)

   - org.apache.lucene.index.BufferedUpdatesStream.applyDocValuesUpdates(BufferedUpdatesStream.java:666)
   - org.apache.lucene.index.BufferedUpdatesStream.applyDocValuesUpdatesList(BufferedUpdatesStream.java:612)
   - org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:269)
   - org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3454)
   - org.apache.lucene.index.IndexWriter.applyDeletesAndPurge(IndexWriter.java:4990)
   - org.apache.lucene.index.DocumentsWriter$ApplyDeletesEvent.process(DocumentsWriter.java:717)
   - org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:5040)
   - org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:5031)
   - org.apache.lucene.index.IndexWriter.updateDocValues(IndexWriter.java:1731)
   - org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:911)
   - org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:302)
   - org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:239)
   - org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:194)


I think this is related to
SOLR-6838 [https://issues.apache.org/jira/browse/SOLR-6838]
and
LUCENE-6161 [https://issues.apache.org/jira/browse/LUCENE-6161]

I need to make the flush faster, to complete the update quicker. Does anyone
have a workaround or any suggestions?

Many thanks,
Dan


Solr Atomic Document update Conditional

2017-05-18 Thread Aman Deep Singh
Hi,
Is there any way to do a Solr atomic update based on some condition?
Suppose in my Solr schema I have some fields:

   1. field1
   2. field2
   3. field1_timestamp
   4. field2_timestamp

Now I have to update the value of field1 only if field1_timestamp is less than
the provided timestamp.
I found a Solr thread for the same question,
http://lucene.472066.n3.nabble.com/Conditional-atomic-update-td4277224.html
but it doesn't contain any solution.


Thanks,
Aman Deep Singh


Re: cursorMark value causes Request-URI Too Long excpetion

2017-05-18 Thread gigo314
Thanks, that was my assumption as well, that all parameters are
supported by both GET and POST. However, when using the JSON API I keep getting a
400 error code:

/Request/:
{"query":"*","cursorMark":"*","sort":"id asc"}

/Response/:
{"responseHeader":{"status":400,"QTime":0,"params":{"fl":"id","json":"{\"query\":\"*\",\"cursorMark\":\"*\",\"sort\":\"id
asc\"}","rows":"1","wt":"json"}},"error":{"metadata":["error-class","org.apache.solr.common.SolrException","root-error-class","org.apache.solr.common.SolrException"],"msg":"*Unknown
top-level key in JSON request : cursorMark*","code":400}}

Also I was not able to find any examples of cursorMark being used in POST,
neither on the wiki page nor in the reference guide.
Am I using the wrong parameter name?

Thanks!




--
View this message in context: 
http://lucene.472066.n3.nabble.com/cursorMark-value-causes-Request-URI-Too-Long-excpetion-tp4335472p4335590.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: Performance warning: Overlapping onDeskSearchers=2 solr

2017-05-18 Thread Srinivas Kashyap
Hi,

We have not set autoSoftCommit in solrconfig.xml. The only commit we are 
doing is through DIH (assuming it commits after the import).

Also, we have written timed schedulers to check if any records/documents are 
updated in the database and to trigger the re-index of Solr on those updated 
documents.

Below are some more config details in solrconfig.xml





<queryResultWindowSize>20</queryResultWindowSize>
<queryResultMaxDocsCached>200</queryResultMaxDocsCached>
<useColdSearcher>false</useColdSearcher>
<maxWarmingSearchers>2</maxWarmingSearchers>

Thanks and Regards,
Srinivas Kashyap

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: 17 May 2017 08:51 PM
To: solr-user 
Subject: Re: Performance warning: Overlapping onDeskSearchers=2 solr

Also, what is your autoSoftCommit setting? That also opens up a new searcher.

On Wed, May 17, 2017 at 8:15 AM, Jason Gerlowski 
> wrote:
> Hey Shawn, others.
>
> This is a pitfall that Solr users seem to run into with some
> frequency.  (Anecdotally, I've bookmarked the Lucidworks article you
> referenced because I end up referring people to it often enough.)
>
> The immediate first advice when someone encounters these
> onDeckSearcher error messages is to examine their commit settings.  Is
> there any other possible cause for those messages?  If not, can we
> consider changing the log/exception error message to be more explicit
> about the cause?
>
> A strawman new message could be: "Performance warning: Overlapping
> onDeskSearchers=2; consider reducing commit frequency if performance
> problems encountered"
>
> Happy to create a JIRA/patch for this; just wanted to get some
> feedback first in case there's an obvious reason the messages don't
> get explicit about the cause.
>
> Jason
>
> On Wed, May 17, 2017 at 8:49 AM, Shawn Heisey 
> > wrote:
>> On 5/17/2017 5:57 AM, Srinivas Kashyap wrote:
>>> We are using Solr 5.2.1 version and are currently experiencing below 
>>> Warning in Solr Logging Console:
>>>
>>> Performance warning: Overlapping onDeskSearchers=2
>>>
>>> Also we encounter,
>>>
>>> org.apache.solr.common.SolrException: Error opening new searcher. exceeded 
>>> limit of maxWarmingSearchers=2, try again later.
>>>
>>>
>>> The reason being, we are doing mass update on our application and solr 
>>> experiencing the higher loads at times. Data is being indexed using DIH(sql 
>>> queries).
>>>
>>> In solrconfig.xml below is the code.
>>>
>>> 
>>>
>>> Should we be uncommenting the above lines and try to avoid this error? 
>>> Please help me.
>>
>> This warning means that you are committing so frequently that there
>> are already two searchers warming when you start another commit.
>>
>> DIH does a commit exactly once -- at the end of the import.  One import will 
>> not cause the warning message you're seeing, so if there is one import 
>> happening at a time, either you are sending explicit commit requests during 
>> the import, or you have autoSoftCommit enabled with values that are far too 
>> small.
>>
>> You should definitely have autoCommit configured, but I would remove
>> maxDocs and set maxTime to something like 60000 -- one minute.  The
>> autoCommit should also set openSearcher to false.  This kind of
>> commit will not make new changes visible, but it will start a new
>> transaction log frequently.
>>
>>
>> <autoCommit>
>>   <maxTime>60000</maxTime>
>>   <openSearcher>false</openSearcher>
>> </autoCommit>
>>
>>
>> An automatic commit (soft or hard) with a one second interval is going to 
>> cause that warning you're seeing.
>>
>> https://lucidworks.com/understanding-transaction-logs-softcommit-and-
>> commit-in-sorlcloud/
>>
>> Thanks,
>> Shawn
>>


DISCLAIMER: E-mails and attachments from Bamboo Rose, LLC are confidential. If 
you are not the intended recipient, please notify the sender immediately by 
replying to the e-mail, and then delete it without making copies or using it in 
any way. No representation is made that this email or any attachments are free 
of viruses. Virus scanning is recommended and is the responsibility of the 
recipient.

  



Re: setup solrcloud from scratch vie web-ui

2017-05-18 Thread Thomas Porschberg

> Shawn Heisey  hat am 17. Mai 2017 um 15:10 geschrieben:
> 
> 
> On 5/17/2017 6:18 AM, Thomas Porschberg wrote:
> > Thank you. I am now a step further.
> > I could import data into the new collection with the DIH. However I 
> > observed the following exception 
> > in solr.log:
> >
> > request: 
> > http://127.0.1.1:8983/solr/hugo_shard1_replica1/update?update.distrib=TOLEADER=http%3A%2F%2F127.0.1.1%3A8983%2Fsolr%2Fhugo_shard2_replica1%2F=javabin=2
> > Remote error message: This IndexSchema is not mutable.
> 
> This probably means that the configuration has an update processor that
> adds unknown fields, but is using the classic schema instead of the
> managed schema.  If you want unknown fields to automatically be guessed
> and added, then you need the managed schema.  If not, then remove the
> custom update processor chain.  If this doesn't sound like what's wrong,
> then we will need the entire error message including the full Java
> stacktrace.  That may be in the other instance's solr.log file.

Ok, commenting out the "update processor chain" was a solution. I use classic 
schema.

> 
> > I imagine to split my data per day of the year. My idea was to create 365 
> > shards of type compositeKey.
> 
> You cannot control shard routing explicitly with the compositeId
> router.  That router uses a hash of the uniqueKey field to decide which
> shard gets the document.  As its name implies, the hash can be composite
> -- parts of the hash can be decided by multiple parts of the value in
> the field, but it's still hashed.
> 
> You must use the implicit router (which means all routing is manual) if
> you want to explicitly name the shard that receives the data.

I was now able to create 365 shards with the 'implicit' router.
In the Collections API call I also specified
router.field=part_crit
which is the day of the year (1..365).
I added this field in my SQL statement and in schema.xml.
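
For what it's worth, a small SolrJ sketch of what that routing contract means (the
collection name is taken from the error below, the id is made up, and an existing
SolrClient named client is assumed): with router=implicit and router.field=part_crit,
the value of part_crit must be the name of an existing shard, so the 365 shards need
to be named "1" through "365".

   SolrInputDocument doc = new SolrInputDocument();
   doc.addField("id", "row-42");
   doc.addField("part_crit", "230");   // with router.field, this value is used as the target shard's name
   client.add("hansi", doc);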

Next step I thought would be to trigger the dataimport.

However I get:

2017-05-18 05:41:37.417 ERROR (Thread-14) [c:hansi s:308 r:core_node76 
x:hansi_308_replica1] o.a.s.h.d.DataImporter Full Import 
failed:java.lang.RuntimeException: org.apache.solr.common.SolrException: No 
registered leader was found after waiting for 4000ms , collection: hansi slice: 
230
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:270)
at 
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:475)
at 
org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:458)
at java.lang.Thread.run(Thread.java:745)

when I start the import.

What could be the reason?

Thank you
Thomas