Re: [CDCR]Unable to locate core

2019-05-19 Thread Natarajan, Rajeswari
Thanks, Amrit. Created a bug:
https://issues.apache.org/jira/browse/SOLR-13481

Regards,
Rajeswari

On 5/19/19, 3:44 PM, "Amrit Sarkar"  wrote:

Sounds legit to me.

Can you create a Jira and write up the problem statement and proposed design
there? I am confident it will attract committers' attention, and they can
review the design and provide feedback.

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2
Medium: https://medium.com/@sarkaramrit2



Re: [CDCR]Unable to locate core

2019-05-19 Thread Amrit Sarkar
Sounds legit to me.

Can you create a Jira and write up the problem statement and proposed design
there? I am confident it will attract committers' attention, and they can
review the design and provide feedback.

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2
Medium: https://medium.com/@sarkaramrit2



Re: [CDCR]Unable to locate core

2019-05-19 Thread Natarajan, Rajeswari
Thanks, Amrit, for creating a patch. But the code in LBHttpSolrClient.java
needs to be fixed too if the for loop is to work as intended.
Regards
Rajeswari

public Rsp request(Req req) throws SolrServerException, IOException {
Rsp rsp = new Rsp();
Exception ex = null;
boolean isNonRetryable = req.request instanceof IsUpdateRequest || 
ADMIN_PATHS.contains(req.request.getPath());
List<ServerWrapper> skipped = null;

final Integer numServersToTry = req.getNumServersToTry();
int numServersTried = 0;

boolean timeAllowedExceeded = false;
long timeAllowedNano = getTimeAllowedInNanos(req.getRequest());
long timeOutTime = System.nanoTime() + timeAllowedNano;
for (String serverStr : req.getServers()) {
  if (timeAllowedExceeded = isTimeExceeded(timeAllowedNano, timeOutTime)) {
break;
  }
  
  serverStr = normalize(serverStr);
  // if the server is currently a zombie, just skip to the next one
  ServerWrapper wrapper = zombieServers.get(serverStr);
  if (wrapper != null) {
// System.out.println("ZOMBIE SERVER QUERIED: " + serverStr);
final int numDeadServersToTry = req.getNumDeadServersToTry();
if (numDeadServersToTry > 0) {
  if (skipped == null) {
skipped = new ArrayList<>(numDeadServersToTry);
skipped.add(wrapper);
  }
  else if (skipped.size() < numDeadServersToTry) {
skipped.add(wrapper);
  }
}
continue;
  }
  try {
MDC.put("LBHttpSolrClient.url", serverStr);

if (numServersToTry != null && numServersTried > 
numServersToTry.intValue()) {
  break;
}

HttpSolrClient client = makeSolrClient(serverStr);

++numServersTried;
ex = doRequest(client, req, rsp, isNonRetryable, false, null);
if (ex == null) {
  return rsp; // SUCCESS
}
  } finally {
MDC.remove("LBHttpSolrClient.url");
  }
}

// try the servers we previously skipped
if (skipped != null) {
  for (ServerWrapper wrapper : skipped) {
if (timeAllowedExceeded = isTimeExceeded(timeAllowedNano, timeOutTime)) 
{
  break;
}

if (numServersToTry != null && numServersTried > 
numServersToTry.intValue()) {
  break;
}

try {
  MDC.put("LBHttpSolrClient.url", wrapper.client.getBaseURL());
  ++numServersTried;
  ex = doRequest(wrapper.client, req, rsp, isNonRetryable, true, 
wrapper.getKey());
  if (ex == null) {
return rsp; // SUCCESS
  }
} finally {
  MDC.remove("LBHttpSolrClient.url");
}
  }
}


final String solrServerExceptionMessage;
if (timeAllowedExceeded) {
  solrServerExceptionMessage = "Time allowed to handle this request 
exceeded";
} else {
  if (numServersToTry != null && numServersTried > 
numServersToTry.intValue()) {
solrServerExceptionMessage = "No live SolrServers available to handle 
this request:"
+ " numServersTried="+numServersTried
+ " numServersToTry="+numServersToTry.intValue();
  } else {
solrServerExceptionMessage = "No live SolrServers available to handle 
this request";
  }
}
if (ex == null) {
  throw new SolrServerException(solrServerExceptionMessage);
} else {
  throw new SolrServerException(solrServerExceptionMessage+":" + 
zombieServers.keySet(), ex);
}

  }





Re: CDCR one source multiple targets

2019-05-19 Thread Amrit Sarkar
Thanks, Arnold.

Is the documentation unclear about how multiple CDCR targets can be
configured?

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2
Medium: https://medium.com/@sarkaramrit2


On Thu, Apr 11, 2019 at 2:59 AM Arnold Bronley 
wrote:

> This had a very simple solution, if anybody else is wondering about the same
> issue. I had to define separate replica elements inside the cdcr request
> handler. The following is an example:
>
>   "replica"> target1:2181  techproducts techproducts str>   target2:2181  techproducts
>  name="target">techproducts"threadPoolSize">8 1000  "batchSize">128"schedule">1000name="buffer"> disabled   requestHandler>
>
> On Thu, Mar 21, 2019 at 10:40 AM Arnold Bronley 
> wrote:
>
> > I see a similar question asked but no answers there either.
> >
> http://lucene.472066.n3.nabble.com/CDCR-Replication-from-one-source-to-multiple-targets-td4308717.html
> > The OP there is using multiple cdcr request handlers, but in my case I am
> > using multiple zkHost strings. It will be pretty limiting if we cannot use
> > CDCR for a one source, multiple target cluster situation.
> > Can somebody please confirm whether this is even supported?
> >
> >
> > On Wed, Mar 20, 2019 at 1:12 PM Arnold Bronley 
> > wrote:
> >
> >> Hi,
> >>
> >> is it possible to use CDCR with one source SolrCloud cluster and multiple
> >> target SolrCloud clusters? I tried to edit the zkHost setting in the source
> >> cluster's solrconfig file by adding multiple comma-separated values for
> >> target zkHosts for multiple target clusters. But the CDCR replication
> >> happens only to one of the zkHosts and not all. If this is not supported,
> >> then how should I go about implementing something like this?
> >>
> >>
> >
>


Re: CDCR - shards not in sync

2019-05-19 Thread Amrit Sarkar
Hi Jay,

Can you look at the logs and identify whether there are any exceptions occurring
on the particular Solr nodes where the lagging shards are hosted?

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2
Medium: https://medium.com/@sarkaramrit2


On Mon, Apr 15, 2019 at 8:33 PM Jay Potharaju  wrote:

> Hi,
> I have a collection with 8 shards. 6 of the shards are in sync, but the
> other 2 are lagging behind by more than 10 hours. The tlog is only 0.5
> GB in size. I have tried stopping and starting CDCR a number of times, but it
> has not helped.
> From what I have noticed, there is always a shard that is slower than the
> others.
>
> Solr version: 7.7.0
> CDCR config
>
>   <lst name="replicator">
>     <str name="threadPoolSize">2</str>
>     <str name="schedule">10</str>
>     <str name="batchSize">4500</str>
>   </lst>
>
>   <lst name="updateLogSynchronizer">
>     <str name="schedule">6</str>
>   </lst>
>
>
> Thanks
> Jay
>


[CDCR]Unable to locate core

2019-05-19 Thread Amrit Sarkar
>
> Thanks, Natarajan,
>
> Solid analysis. I saw the issue being reported by multiple users in the
> past few months, and unfortunately I baked incomplete code.
>
> I think the correct way of solving this issue is to identify the correct
> base-url for the respective core we need to trigger REQUESTRECOVERY to, and
> create a local HttpSolrClient instead of using CloudSolrClient from
> CdcrReplicatorState. This will avoid unnecessary retries, which would be
> redundant in our case.
>
> I baked a small patch a few weeks back and will upload it to SOLR-11724.
>


Re: [CDCR]Unable to locate core

2019-05-19 Thread Natarajan, Rajeswari
Here is my close analysis:


The SolrClient request goes to the "request" method below in the class
LBHttpSolrClient.java. There is a for loop to try the different live servers,
but when the doRequest method (in the request method below) throws an
exception there is no catch, so the next retry is not attempted. To solve this
issue, there should be a catch around doRequest so that the next server is
tried. But in case there are multiple live servers, the request might also
time out. This needs to be fixed to make CDCR bootstrap work reliably;
otherwise it will sometimes work and sometimes not. I can work on this patch
if this is agreed.


public Rsp request(Req req) throws SolrServerException, IOException {
Rsp rsp = new Rsp();
Exception ex = null;
boolean isNonRetryable = req.request instanceof IsUpdateRequest || 
ADMIN_PATHS.contains(req.request.getPath());
List<ServerWrapper> skipped = null;

final Integer numServersToTry = req.getNumServersToTry();
int numServersTried = 0;

boolean timeAllowedExceeded = false;
long timeAllowedNano = getTimeAllowedInNanos(req.getRequest());
long timeOutTime = System.nanoTime() + timeAllowedNano;
for (String serverStr : req.getServers()) {
  if (timeAllowedExceeded = isTimeExceeded(timeAllowedNano, timeOutTime)) {
break;
  }
  
  serverStr = normalize(serverStr);
  // if the server is currently a zombie, just skip to the next one
  ServerWrapper wrapper = zombieServers.get(serverStr);
  if (wrapper != null) {
// System.out.println("ZOMBIE SERVER QUERIED: " + serverStr);
final int numDeadServersToTry = req.getNumDeadServersToTry();
if (numDeadServersToTry > 0) {
  if (skipped == null) {
skipped = new ArrayList<>(numDeadServersToTry);
skipped.add(wrapper);
  }
  else if (skipped.size() < numDeadServersToTry) {
skipped.add(wrapper);
  }
}
continue;
  }
  try {
MDC.put("LBHttpSolrClient.url", serverStr);

if (numServersToTry != null && numServersTried > 
numServersToTry.intValue()) {
  break;
} 

HttpSolrClient client = makeSolrClient(serverStr);

++numServersTried;
ex = doRequest(client, req, rsp, isNonRetryable, false, null);
if (ex == null) {
  return rsp; // SUCCESS
}
    // NO CATCH HERE, SO IT FAILS
  } finally {
MDC.remove("LBHttpSolrClient.url");
  }
}

// try the servers we previously skipped
if (skipped != null) {
  for (ServerWrapper wrapper : skipped) {
if (timeAllowedExceeded = isTimeExceeded(timeAllowedNano, timeOutTime)) 
{
  break;
}

if (numServersToTry != null && numServersTried > 
numServersToTry.intValue()) {
  break;
}

try {
  MDC.put("LBHttpSolrClient.url", wrapper.client.getBaseURL());
  ++numServersTried;
  ex = doRequest(wrapper.client, req, rsp, isNonRetryable, true, 
wrapper.getKey());
  if (ex == null) {
return rsp; // SUCCESS
  }
} finally {
  MDC.remove("LBHttpSolrClient.url");
}
  }
}


final String solrServerExceptionMessage;
if (timeAllowedExceeded) {
  solrServerExceptionMessage = "Time allowed to handle this request 
exceeded";
} else {
  if (numServersToTry != null && numServersTried > 
numServersToTry.intValue()) {
solrServerExceptionMessage = "No live SolrServers available to handle 
this request:"
+ " numServersTried="+numServersTried
+ " numServersToTry="+numServersToTry.intValue();
  } else {
solrServerExceptionMessage = "No live SolrServers available to handle 
this request";
  }
}
if (ex == null) {
  throw new SolrServerException(solrServerExceptionMessage);
} else {
  throw new SolrServerException(solrServerExceptionMessage+":" + 
zombieServers.keySet(), ex);
}

  }
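
A minimal sketch of the kind of change described above (hypothetical, not the
actual patch; a real fix would still need to decide exactly which exception
types are safe to retry): the doRequest call in the first loop could be
wrapped in a catch that records the failure and moves on to the next live
server instead of letting the exception escape the loop.

  try {
    MDC.put("LBHttpSolrClient.url", serverStr);

    if (numServersToTry != null && numServersTried > numServersToTry.intValue()) {
      break;
    }

    HttpSolrClient client = makeSolrClient(serverStr);

    ++numServersTried;
    ex = doRequest(client, req, rsp, isNonRetryable, false, null);
    if (ex == null) {
      return rsp; // SUCCESS
    }
  } catch (SolrException | SolrServerException | IOException e) {
    // remember the failure and try the next server in the list
    ex = e;
  } finally {
    MDC.remove("LBHttpSolrClient.url");
  }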


Thanks,
Rajeswari



Re: Solr8.0.0 Performance Test

2019-05-19 Thread Shawn Heisey

On 5/19/2019 12:20 AM, Kayak28 wrote:

Hello, Apache Solr community members:

I have a few questions about the load test of Solr8.

- For Solr 8, the optimize command merges segments down to 2, but not 1.
Is that OK behavior?


Since version 7.5, optimize with TieredMergePolicy (the default policy) 
respects the maximum segment size, which defaults to 5GB.  You can 
explicitly tell the optimize to make one segment, and then that's what 
it will do.  That can be problematic with indexes that have a lot of 
deletes, which is why newer versions respect the max segment size.
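
For example, in SolrJ you can pass maxSegments explicitly (a minimal
illustration; the URL and collection name are placeholders, and the calls
throw SolrServerException/IOException, so this needs to run inside a method
that handles those):

SolrClient client =
    new HttpSolrClient.Builder("http://localhost:8983/solr/techproducts").build();
// waitFlush=true, waitSearcher=true, maxSegments=1 forces the optimize down to one segment
client.optimize(true, true, 1);
client.close();

The same thing can be requested over HTTP with optimize=true&maxSegments=1 on
the update handler.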


I would not expect to see a really noticeable performance increase by 
going from two segments to one.  It's probably not a good idea to 
interfere with Solr's settings there.


- In a certain situation (explained below), Solr 8 (without the use of
HTTP/2 and the block-max WAND algorithm) is faster than Solr 7.4.0. What are
the likely causes of this performance improvement? Or did I plan the
load test badly?


It is always the goal with new Solr versions to have better performance 
than previous versions.  Sometimes that happens, sometimes it doesn't. 
On occasion mistakes have been made that go the wrong direction ... we 
correct those problems as soon as we find out about them.


Your PDF attachment did not make it to the list.  We cannot see it.  The 
mailing list rarely lets attachments through.


Thanks,
Shawn


Re: Graph query extremely slow

2019-05-19 Thread Rahul Goswami
Hello experts,

Just following up in case my previous email got lost in the big stack of
queries. I would appreciate any help on optimizing a graph query, or any
pointers on which direction to investigate.

Thanks,
Rahul

On Wed, May 15, 2019 at 9:37 PM Rahul Goswami  wrote:

> Hello,
>
> I am running Solr 7.2.1 in standalone mode with 8GB heap. I have an index
> with ~4 million documents. Not too big. I am using a graph query parser to
> filter out some documents as below:
>
> fq={!graph from=from_field to=to_field returnRoot=false}
>
> Both from_field and to_field are indexed and of type string. This is part
> of a bigger query which is taking around 65 seconds to execute. Executing
> _only_ the graph filter query takes about 64.5 seconds. The total number of
> documents from this filter query is a little over 1 million.
>
> Is this performance expected out of graph query ? Any optimizations that I
> could try?
>
>
> Thanks,
> Rahul
>


Re: join the Solr mailing

2019-05-19 Thread Erick Erickson
It’s all a self-registration process. If you followed the instructions for 
subscribing here: 
http://lucene.apache.org/solr/community.html#mailing-lists-irc you should 
already have an answer ;)

Best,
Erick

> On May 19, 2019, at 12:19 AM, Vadim Karagichev  
> wrote:
> 
> Hi,
> 
> A fellow coworker is subscribed to the mail distribution of Solr questions,
> Wanted to ask if I could join as well?
> 
> 
> Thanks
> 
> 



Re: [CDCR]Unable to locate core

2019-05-19 Thread Natarajan, Rajeswari
Hi

We are using Solr 7.6 and trying out bidirectional CDCR, and I also hit this
issue.

Stacktrace

INFO  (cdcr-bootstrap-status-17-thread-1) [   ] o.a.s.h.CdcrReplicatorManager 
CDCR bootstrap successful in 3 seconds  
 
INFO  (cdcr-bootstrap-status-17-thread-1) [   ] o.a.s.h.CdcrReplicatorManager 
Create new update log reader for target abcd_ta with checkpoint -1 @ 
abcd_ta:shard1
ERROR (cdcr-bootstrap-status-17-thread-1) [   ] o.a.s.h.CdcrReplicatorManager 
Unable to bootstrap the target collection abcd_ta shard: shard1 

org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at
http://10.169.50.182:8983/solr: Unable to locate core kanna_ta_shard1_replica_n1
  at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:643) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
  at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
  at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
  at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
  at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
  at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1107) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
  at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:884) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]


I stepped through the code

private NamedList sendRequestRecoveryToFollower(SolrClient client, String coreName)
    throws SolrServerException, IOException {
  CoreAdminRequest.RequestRecovery recoverRequestCmd = new CoreAdminRequest.RequestRecovery();
  recoverRequestCmd.setAction(CoreAdminParams.CoreAdminAction.REQUESTRECOVERY);
  recoverRequestCmd.setCoreName(coreName);
  return client.request(recoverRequestCmd);
}

In the above method, the recovery request command is an admin command, and it
is specific to a core. In the solrClient.request logic, the code gets the live
servers and executes the command in a loop, but since this is an admin command
it is non-retriable. Depending on which live server the code picks and where
the core actually lies, the recovery request command might succeed or fail. So
I think there is a problem with this code trying to send the core command to
any available live server; the code should instead find the correct server on
which the core lies and send the request there.
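
For illustration, a minimal sketch of that idea (hypothetical code, not the
actual patch): assuming the caller has already resolved the base URL of the
node that hosts the target core, the recovery command can be sent through a
plain HttpSolrClient pointed at that node instead of going through the
CloudSolrClient's load-balanced retry loop.

private NamedList<Object> sendRequestRecoveryToFollower(String followerBaseUrl, String coreName)
    throws SolrServerException, IOException {
  CoreAdminRequest.RequestRecovery recoverRequestCmd = new CoreAdminRequest.RequestRecovery();
  recoverRequestCmd.setAction(CoreAdminParams.CoreAdminAction.REQUESTRECOVERY);
  recoverRequestCmd.setCoreName(coreName);
  // talk directly to the node that hosts the core, so the core admin
  // command cannot land on a node that does not know about the core
  try (HttpSolrClient client = new HttpSolrClient.Builder(followerBaseUrl).build()) {
    return client.request(recoverRequestCmd);
  }
}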

Regards,
Rajeswari

On 5/15/19, 10:59 AM, "Natarajan, Rajeswari"  
wrote:

I am also facing this issue. Any resolution found on this issue, Please 
update. Thanks

On 2/7/19, 10:42 AM, "Tim"  wrote:

So it looks like I'm having an issue with this fix:
https://issues.apache.org/jira/browse/SOLR-11724

So I've messed around with this for a while and every time the leader to
leader replica portion works fine. But the Recovery portion 
(implemented as
part of the fix above) fails. 

I've run a few tests and every time the recovery portion kicks off, it 
sends
the recovery command to the node which has the leader for a given 
replica
instead of the follower. 
I've recreated the collection several times so that replicas are on
different nodes with the same results each time. It seems to be assumed 
that
the follower is on the same solr node as the leader. 
 
For example, if s3r10 (shard 3, replica 10) is the leader and is on 
node1,
while the follower s3r8 is on node2, then the core recovery command 
meant
for s3r8 is being sent to node1 instead of node2.





--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html






Re: Problem with SolrJ and indexing PDF files

2019-05-19 Thread Erick Erickson
Here’s a skeletal program to get you started using Tika directly in a SolrJ 
client, with a long explication of why using Solr’s extracting request handler 
is probably not what you want to do in production: 

https://lucidworks.com/2012/02/14/indexing-with-solrj/

SolrServer was renamed SolrClient 4 1/2 years ago; one of my pet peeves is that
lots of pages don’t have dates attached. The link above was updated after this
change even though it was published in 2012, but even so you’ll find some
methods that have since been deprecated.

If you’re using SolrCloud, you should be using CloudSolrClient rather than 
SolrClient.
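
As a rough, hypothetical sketch of that approach (the collection URL, field
names, and file path below are placeholders, not taken from the article):

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.sax.BodyContentHandler;

public class TikaIndexer {
  public static void main(String[] args) throws Exception {
    try (SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/localDocs16").build();
         InputStream in = new FileInputStream(new File("path/testfile.pdf"))) {
      // let Tika detect the file type and extract the body text
      BodyContentHandler handler = new BodyContentHandler(-1); // -1 = no write limit
      Metadata metadata = new Metadata();
      new AutoDetectParser().parse(in, handler, metadata);

      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc1");
      doc.addField("title", metadata.get("title"));  // assumes a "title" field exists in the schema
      doc.addField("content", handler.toString());   // assumes a "content" text field exists in the schema
      solr.add(doc);
      solr.commit();
    }
  }
}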

Best,
Erick




Re: Problem with SolrJ and indexing PDF files

2019-05-19 Thread Jörn Franke
You can use the Tika library to parse the PDFs and then post the text to the
Solr servers.



Problem with SolrJ and indexing PDF files

2019-05-19 Thread Mareike Glock

Dear Solr Team,

I am trying to index Word and PDF documents with Solr using SolrJ, but 
most of the examples I found on the internet use the SolrServer class 
which I guess is deprecated.
The connection to Solr itself is working, because I can add 
SolrInputDocuments to the index but it does not work for rich documents 
because I get an exception.



public static void main(String[] args) throws IOException, 
SolrServerException {

String urlString = "http://localhost:8983/solr/localDocs16";
HttpSolrClient solr = new 
HttpSolrClient.Builder(urlString).build();


//is working
for(int i=0;i<1000;++i) {
SolrInputDocument doc = new SolrInputDocument();
doc.addField("cat", "book");
doc.addField("id", "book-" + i);
doc.addField("name", "The Legend of the Hobbit part " + i);
solr.add(doc);
if(i%100==0) solr.commit();  // periodically flush
}

//is not working
File file = new File("path\\testfile.pdf");

ContentStreamUpdateRequest req = new 
ContentStreamUpdateRequest("update/extract");


req.addFile(file, "application/pdf");
req.setParam("literal.id", "doc1");
req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
try{
solr.request(req);
}
catch(IOException e){
PrintWriter out = new 
PrintWriter("C:\\Users\\mareike\\Desktop\\filename.txt");

e.printStackTrace(out);
out.close();
System.out.println("IO message: " + e.getMessage());
} catch(SolrServerException e){
PrintWriter out = new 
PrintWriter("C:\\Users\\mareike\\Desktop\\filename.txt");

e.printStackTrace(out);
out.close();
System.out.println("SolrServer message: " + e.getMessage());
} catch(Exception e){
PrintWriter out = new 
PrintWriter("C:\\Users\\mareike\\Desktop\\filename.txt");

e.printStackTrace(out);
out.close();
System.out.println("UnknownException message: " + 
e.getMessage());

}finally{
solr.commit();
}
}


I am using Maven (pom.xml attached) and created a JAR file, which I then 
tried to execute from the command line, and this is the output I get:


SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for 
further details.

SLF4J: Failed to load class "org.slf4j.impl.StaticMDCBinder".
SLF4J: Defaulting to no-operation MDCAdapter implementation.
SLF4J: See http://www.slf4j.org/codes.html#no_static_mdc_binder for 
further details.
message: UnknownException message: Error from server at
http://localhost:8983/solr/localDocs17: Bad contentType for search
handler :application/pdf request={wt=javabin&version=2}




I hope you may be able to help me with this. I also posted this issue on
GitHub.


Cheers,
Mareike Glock

http://maven.apache.org/POM/4.0.0"; xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd";>
  4.0.0
  com.mycompany.app
  solr-search
  jar
  1.0
  solr-search
  http://maven.apache.org

  
1.7
1.7
  

  

  
maven-assembly-plugin

  

  com.mycompany.app.Main

  
  
jar-with-dependencies
  

  

  

  

  junit
  junit
  3.8.1
  test


  org.apache.solr
  solr-solrj
  7.7.0

  




Re: Solr8.0.0 Performance Test

2019-05-19 Thread Kayak28
Hello, Cao Mạnh Đạt and Community Members:

Thank you for your response and asking for missing information.
I am answering as the following.

> Can you tell more about your setup?
1. Index Wikipedia data with the Data Import Handler.
2. Run Solr on the server described in the previous email.
3. Run JMeter from a Mac laptop (MacBook Pro 2018, 2.3 GHz Intel Core i5, 16 GB memory).
4. Execute the JMeter plan with 20 threads, ramp-up 0, loop count 1000.
5. Collect the results as a PDF file.

(I am not sure if these steps are what you are asking for.)

> Are you using SolrCloud (with how many
> shards and replicas)?
Both Solrs, Solr 8.0.0 and Solr 7.4.0, are standalone.
So, inter-node communication over HTTP/2 is, for sure, not a factor.

If you need more information, I would be happy to provide it.
So, could you please let me know?

Sincerely,
Kaya Ota





join the Solr mailing

2019-05-19 Thread Vadim Karagichev
Hi,

A fellow coworker is subscribed to the mail distribution of Solr questions,
Wanted to ask if I could join as well?


Thanks


This email and any attachments thereto may contain private, confidential, and 
privileged material for the sole use of the intended recipient. Any review, 
copying, or distribution of this email (or any attachments thereto) by others 
is strictly prohibited. If you are not the intended recipient, please contact 
the sender immediately and permanently delete the original and any copies of 
this email and any attachments thereto.


Re: Solr8.0.0 Performance Test

2019-05-19 Thread Đạt Cao Mạnh
Hi Kaya,

Can you tell more about your setup? Are you using SolrCloud (with how many
shards and replicas)?
Since the inter-node communications are using HTTP/2 by default now.

On Sun, 19 May 2019 at 07:29, Kayak28  wrote:

> Hello, Apache Solr community members:
>
> I have a few questions about the load test of Solr8.
>
> - For Solr 8, the optimize command merges segments down to 2, but not 1.
> Is that OK behavior?
> When indexing Wikipedia data, Solr 8 generated multiple segments.
> So, I executed the optimize command from the Admin UI.
> Solr 8 did reduce the number of segments, but it left two segments.
> Hence, I wonder whether this is OK behavior or whether it is weird.
>
>
>
> - In a certain situation (explained below), Solr 8 (without the use of
> HTTP/2 and the block-max WAND algorithm) is faster than Solr 7.4.0. What are
> the likely causes of this performance improvement? Or did I plan the
> load test badly?
>
> Here is the story I came up with these questions.
> I performed a simple load test on Solr8 to observe the difference in
> performance.
> So, I wondered how fast it became compared to Solr 7.4.0, which is the
> version I currently use.
>
> My testing environment is below:
> OS: Ubuntu 16.04
> Vendor: DELL PowerEdge T410
> CPU:Intel(R) Xeon(R) CPU E5620 @2.40 GHz 8 Core
> Memory: 16GB
> Hard Disk: 3.5 Inch SATA (7,200 rpm): 500 GB
>
> The data is from the Japanese Wikipedia dump.
>
> After indexing them, both versions of Solr store 2'366'754 documents, with
> an index size of 8.48 GB and a JVM heap of 8 GB respectively.
>
> In order to perform several times of load-tests, only fieldValueCache and
> fieldCache are working; other Solr's caches are turned off.
>
> I use Jmeter(5.1.1) to measure average response time and throughput.
> I know Jmeter only sends HTTP/1 requests, without a plugin. (and I did not
> use the plugin)
> So, this result should not be affected by HTTP/2.
>
> Also, according to a JIRA (
> https://issues.apache.org/jira/browse/SOLR-13289 ).
> Solr 8 does not support the block-max WAND algorithm yet, so again this result
> should not be affected by the algorithm, which makes Lucene faster.
>
> The results from Jmeter is attached as a PDF file.
>
> According to these results, Solr 8 is somehow superior to Solr 7.4.0.
>
> But, I have no idea what are the considerable causes of this difference.
> Does anyone have any idea about this?
>
>
> Sincerely,
> Kaya Ota
>
-- 
*Best regards,*
*Cao Mạnh Đạt*


*D.O.B : 31-07-1991Cell: (+84) 946.328.329E-mail: caomanhdat...@gmail.com
*