Re: Issue with scoreNodes stream expression

2017-09-19 Thread Joel Bernstein
Have you tried running a very simple expression first. For example does
this run:

random(gettingstarted, q="*:*", fl="id", rows="200")
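(Editor's note: besides pasting it into the admin UI's stream tab, an expression like this can be sent straight to the /stream handler with curl; the host, port, and collection below assume the default `solr -e cloud` install from the original report:)

```
curl --data-urlencode 'expr=random(gettingstarted, q="*:*", fl="id", rows="200")' \
  "http://localhost:8983/solr/gettingstarted/stream"
```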



Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, Sep 19, 2017 at 4:56 PM, Aurélien MAZOYER <
aurelien.mazo...@francelabs.com> wrote:

> Hi,
>
>
>
> I wanted to try the new scoreNodes stream expression that is used to make
> recommendations:
>
> https://cwiki.apache.org/confluence/display/solr/Graph+Traversal#GraphTraversal-UsingthescoreNodesFunctiontoMakeaRecommendation
>
> but encountered some issue with it.
>
>
>
> The following steps can easily reproduce the problem:
>
> I started Solr (6.6.1) in cloud mode :
>
> solr -e cloud -noprompt
>
> then run the following command in exampledocs to index the sample data :
>
> java -Dc=gettingstarted -jar post.jar *.xml
>
> and finally copy/pasted the following expression into the stream tab:
>
> scoreNodes(top(n=25,
>                sort="count(*) desc",
>                nodes(gettingstarted,
>                      random(gettingstarted, q="*:*", fl="id", rows="200"),
>                      walk="id->id",
>                      gather="id",
>                      count(*))))
>
> (yes, I know that my stream expression does nothing useful :-P).
>
> Anyway, I got the following exception when I run the query:
>
> "EXCEPTION": "org.apache.solr.client.solrj.SolrServerException: No
> collection param specified on request and no default collection has been
> set.",
>
> Any idea of what I did wrong?
>
>
>
> Thank you,
>
>
>
> Regards,
>
>
>
> Aurélien
>
>
>
>
>
>
>
>


Re: Solr Streaming Question

2017-09-19 Thread Joel Bernstein
Also random() will work with any type of field. Only the /export handler
limits the field list to docValues.

Each time you call random() it will give you a different random sample.

Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, Sep 19, 2017 at 10:04 PM, Joel Bernstein  wrote:

> Try this construct:
>
> update(list(random(...), random(...), random(...)))
>
>
>
>
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Tue, Sep 19, 2017 at 9:02 PM, Susheel Kumar 
> wrote:
>
>> You can follow the section "Creating an Alert With the Topic Streaming
>> Expression" at http://joelsolr.blogspot.com/ and use the random function to
>> get random records, scheduling it with the daemon function to retrieve them
>> periodically, etc.
>>
>> Thanks,
>> Susheel
>>
>>
>>
>> On Tue, Sep 19, 2017 at 4:56 PM, Erick Erickson 
>> wrote:
>>
>> > Webster:
>> >
>> > I think you're looking for UpdateStream. Unfortunately the fix version
>> > wasn't entered so you'll have to look at your particular version but
>> > going strictly from the dates it appears in 6.0.
>> >
>> > David:
>> >
>> > Stored is irrelevant. Streaming only works with docValues="true"
>> > fields and moves the docValues content over.
>> >
>> > Best,
>> > Erick
>> >
>> > On Tue, Sep 19, 2017 at 12:39 PM, David Hastings
>> >  wrote:
>> > > I am also curious about this, specifically about indexed/non stored
>> > fields.
>> > >
>> > > On Tue, Sep 19, 2017 at 3:33 PM, Webster Homer <
>> webster.ho...@sial.com>
>> > > wrote:
>> > >
>> > >> Is it possible to use the streaming API to stream documents from a
>> > >> collection and load them into a new collection? I was thinking that
>> this
>> > >> would be a great way to get a random sample of data from our main
>> > >> collections to developer machines. Making it a random sample would be
>> > >> useful as well. This looks feasible, but I've only scratched the
>> > surface of
>> > >> streaming Solr
>> > >>
>> > >> Thanks
>> > >>
>> > >> --
>> > >>
>> > >>
>> > >> This message and any attachment are confidential and may be
>> privileged
>> > or
>> > >> otherwise protected from disclosure. If you are not the intended
>> > recipient,
>> > >> you must not copy this message or attachment or disclose the
>> contents to
>> > >> any other person. If you have received this transmission in error,
>> > please
>> > >> notify the sender immediately and delete the message and any
>> attachment
>> > >> from your system. Merck KGaA, Darmstadt, Germany and any of its
>> > >> subsidiaries do not accept liability for any omissions or errors in
>> this
>> > >> message which may arise as a result of E-Mail-transmission or for
>> > damages
>> > >> resulting from any unauthorized changes of the content of this
>> message
>> > and
>> > >> any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its
>> > >> subsidiaries do not guarantee that this message is free of viruses
>> and
>> > does
>> > >> not accept liability for any damages caused by any virus transmitted
>> > >> therewith.
>> > >>
>> > >> Click http://www.emdgroup.com/disclaimer to access the German,
>> French,
>> > >> Spanish and Portuguese versions of this disclaimer.
>> > >>
>> >
>>
>
>


Re: Solr Streaming Question

2017-09-19 Thread Joel Bernstein
Try this construct:

update(list(random(...), random(...), random(...)))
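(Editor's note: spelled out a bit more, with hypothetical collection and field names — the fields involved need docValues — that construct samples the source collection and writes the sample into a target collection:)

```
update(devSample, batchSize=250,
  list(
    random(mainCollection, q="*:*", fl="id,name_s", rows="1000"),
    random(mainCollection, q="*:*", fl="id,name_s", rows="1000"),
    random(mainCollection, q="*:*", fl="id,name_s", rows="1000")))
```

The three samples are drawn independently, so overlap between the random() calls is possible; update() simply overwrites documents that share an id.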







Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, Sep 19, 2017 at 9:02 PM, Susheel Kumar 
wrote:

> You can follow the section "Creating an Alert With the Topic Streaming
> Expression" at http://joelsolr.blogspot.com/ and use the random function to
> get random records, scheduling it with the daemon function to retrieve them
> periodically, etc.
>
> Thanks,
> Susheel
>
>
>
> On Tue, Sep 19, 2017 at 4:56 PM, Erick Erickson 
> wrote:
>
> > Webster:
> >
> > I think you're looking for UpdateStream. Unfortunately the fix version
> > wasn't entered so you'll have to look at your particular version but
> > going strictly from the dates it appears in 6.0.
> >
> > David:
> >
> > Stored is irrelevant. Streaming only works with docValues="true"
> > fields and moves the docValues content over.
> >
> > Best,
> > Erick
> >
> > On Tue, Sep 19, 2017 at 12:39 PM, David Hastings
> >  wrote:
> > > I am also curious about this, specifically about indexed/non stored
> > fields.
> > >
> > > On Tue, Sep 19, 2017 at 3:33 PM, Webster Homer  >
> > > wrote:
> > >
> > >> Is it possible to use the streaming API to stream documents from a
> > >> collection and load them into a new collection? I was thinking that
> this
> > >> would be a great way to get a random sample of data from our main
> > >> collections to developer machines. Making it a random sample would be
> > >> useful as well. This looks feasible, but I've only scratched the
> > surface of
> > >> streaming Solr
> > >>
> > >> Thanks
> > >>
> > >>
> >
>


Cannot load LTRQParserPlugin into my core

2017-09-19 Thread Billy Chan
Hi !
I am having this issue:
• org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: 
Error loading class 'org.apache.solr.ltr.search.LTRQParserPlugin'

I have set up

in my solrconfig.xml and my solr version is 6.5.0

I can see there is no jar in LTR contrib…so I didn’t set 
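(Editor's note: the `<lib/>` elements in the original message were stripped by the archive. For reference, in Solr 6.x the LTR jar itself ships under dist/, with its dependencies under contrib/ltr/lib/, so a solrconfig.xml along these lines should find the class — the relative paths are an assumption based on the standard install layout and depend on where the core lives:)

```xml
<lib dir="${solr.install.dir:../../../..}/contrib/ltr/lib/" regex=".*\.jar" />
<lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-ltr-\d.*\.jar" />

<queryParser name="ltr" class="org.apache.solr.ltr.search.LTRQParserPlugin"/>
```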

Re: Solr Streaming Question

2017-09-19 Thread Susheel Kumar
You can follow the section "Creating an Alert With the Topic Streaming
Expression" at http://joelsolr.blogspot.com/ and use the random function to
get random records, scheduling it with the daemon function to retrieve them
periodically, etc.

Thanks,
Susheel



On Tue, Sep 19, 2017 at 4:56 PM, Erick Erickson 
wrote:

> Webster:
>
> I think you're looking for UpdateStream. Unfortunately the fix version
> wasn't entered so you'll have to look at your particular version but
> going strictly from the dates it appears in 6.0.
>
> David:
>
> Stored is irrelevant. Streaming only works with docValues="true"
> fields and moves the docValues content over.
>
> Best,
> Erick
>
> On Tue, Sep 19, 2017 at 12:39 PM, David Hastings
>  wrote:
> > I am also curious about this, specifically about indexed/non stored
> fields.
> >
> > On Tue, Sep 19, 2017 at 3:33 PM, Webster Homer 
> > wrote:
> >
> >> Is it possible to use the streaming API to stream documents from a
> >> collection and load them into a new collection? I was thinking that this
> >> would be a great way to get a random sample of data from our main
> >> collections to developer machines. Making it a random sample would be
> >> useful as well. This looks feasible, but I've only scratched the
> surface of
> >> streaming Solr
> >>
> >> Thanks
> >>
> >> --
> >>
> >>
> >> This message and any attachment are confidential and may be privileged
> or
> >> otherwise protected from disclosure. If you are not the intended
> recipient,
> >> you must not copy this message or attachment or disclose the contents to
> >> any other person. If you have received this transmission in error,
> please
> >> notify the sender immediately and delete the message and any attachment
> >> from your system. Merck KGaA, Darmstadt, Germany and any of its
> >> subsidiaries do not accept liability for any omissions or errors in this
> >> message which may arise as a result of E-Mail-transmission or for
> damages
> >> resulting from any unauthorized changes of the content of this message
> and
> >> any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its
> >> subsidiaries do not guarantee that this message is free of viruses and
> does
> >> not accept liability for any damages caused by any virus transmitted
> >> therewith.
> >>
> >> Click http://www.emdgroup.com/disclaimer to access the German, French,
> >> Spanish and Portuguese versions of this disclaimer.
> >>
>


RE: Using SOLR J 5.5.4 with SOLR 6.5

2017-09-19 Thread Felix Stanley
Thanks so much for all the answers, gonna test it out then.


Best Regards,

Felix Stanley



-Original Message-
From: Walter Underwood [mailto:wun...@wunderwood.org] 
Sent: Wednesday, September 20, 2017 1:05 AM
To: solr-user@lucene.apache.org
Subject: Re: Using SOLR J 5.5.4 with SOLR 6.5

As I understand it, any node in the cluster will direct the document to the 
leader for the appropriate shard.

Works for us.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Sep 19, 2017, at 9:59 AM, David Hastings  
> wrote:
> 
> Thanks! Going to have to throw up another Solr 6.x instance for
> testing again. SolrCloud will maintain index integrity across the
> nodes if indexed to just one node, correct?
> 
> On Tue, Sep 19, 2017 at 12:55 PM, Walter Underwood 
> 
> wrote:
> 
>> Yes, good old HTTP.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>> 
>>> On Sep 19, 2017, at 9:54 AM, David Hastings <
>> hastings.recurs...@gmail.com> wrote:
>>> 
>>> Do you use HttpSolrClient then?
>>> 
>>> On Tue, Sep 19, 2017 at 12:26 PM, Walter Underwood <
>> wun...@wunderwood.org>
>>> wrote:
>>> 
 We run SolrJ 4.7.1 with Solr 6.5.1 (16 node cloud). No problems.
 
 We do not use the cloud-specific client and I’m pretty sure that we
>> don’t
 use ConcurrentUpdateSolrServer. The latter is because it doesn’t 
 report errors properly.
 
 We do our indexing through the load balancer and let the Solr Cloud 
 cluster get the right docs to the right shards. That runs at 1 
 million docs/minute, so it isn’t worth doing anything fancier.
 
 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/  (my blog)
 
 
> On Sep 19, 2017, at 9:05 AM, David Hastings <
 hastings.recurs...@gmail.com> wrote:
> 
> What about the ConcurrentUpdateSolrServer for solrj?  That is what
>> almost
> all of my indexing code is using for Solr 5.x. It's been a while
> since I experimented with upgrading, but I seem to remember having
> to go to HttpSolrClient and couldn't get the code to compile, so I
> tabled the experiment for a while. Eventually I will need to move
> to Solr 6, but if I
> could keep the same indexing code that would be ideal.
> 
> On Tue, Sep 19, 2017 at 11:59 AM, Erick Erickson <
 erickerick...@gmail.com>
> wrote:
> 
>> Felix:
>> 
>> There's no specific testing that I know of for this issue, it's 
>> "best effort". Which means it _should_ work but I can't make promises.
>> 
>> Now that said, underlying it all is just HTTP requests going back 
>> and forth so I know of no a-priori reasons it wouldn't be fine. 
>> It's just "try it and see" though.
>> 
>> Best,
>> Erick
>> 
>> I'm probably preaching to the choir, but Java 1.7 is two years 
>> past the end of support from Oracle, somebody sometime has to 
>> deal with upgrading.
>> 
>> On Mon, Sep 18, 2017 at 10:47 PM, Felix Stanley 
>>  wrote:
>>> Hi there,
>>> 
>>> 
>>> 
>>> We are planning to use SOLR J 5.5.4 to query from SOLR 6.5.
>>> 
>>> The reason was that we have to rely on JDK 1.7 at the client and 
>>> as
>> far
>> as I
>>> know SOLR J 6.x.x only support JDK 1.8.
>>> 
>>> I understood that SOLR J generally maintains backwards/forward
>> compatibility
>>> from this article:
>>> 
>>> 
>>> 
>>> https://wiki.apache.org/solr/Solrj
>>> 
>>> 
>>> 
>>> Would there though be any exception that we need to take caution 
>>> of
>> for
>> this
>>> specific version?
>>> 
>>> 
>>> 
>>> Thanks a lot.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Best Regards,
>>> 
>>> 
>>> 
>>> Felix Stanley
>>> 
>>> 
>>> 
>>> 
>>> --
>>> CONFIDENTIALITY NOTICE
>>> 
>>> This e-mail (including any attachments) may contain confidential
>> and/or
>> privileged information. If you are not the intended recipient or 
>> have received this e-mail in error, please inform the sender 
>> immediately
>> and
>> delete this e-mail (including any attachments) from your 
>> computer, and
 you
>> must not use, disclose to anyone else or copy this e-mail 
>> (including
>> any
>> attachments), whether in whole or in part.
>>> 
>>> This e-mail and any reply to it may be monitored for security, 
>>> legal,
>> regulatory compliance and/or other appropriate reasons.
>> 
 
 
>> 
>> 




question about an entry in the log file

2017-09-19 Thread kaveh minooie

Hi everyone,

I am trying to figure out why calling commit from my client takes a very
long time in an environment with concurrent updates. I see the following
snippet in the Solr log files when the client calls commit. My question is
about the third INFO entry: what is it opening, and how can I make Solr
stop doing that?



INFO  - 2017-09-19 16:42:20.557; [c:dosweb2016 s:shard2 r:core_node5 
x:dosweb2016] org.apache.solr.update.DirectUpdateHandler2; start 
commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO  - 2017-09-19 16:42:20.557; [c:dosweb2016 s:shard2 r:core_node5 
x:dosweb2016] org.apache.solr.update.SolrIndexWriter; Calling 
setCommitData with IW:org.apache.solr.update.SolrIndexWriter@3ee73284
INFO  - 2017-09-19 16:42:20.660; [c:dosweb2016 s:shard2 r:core_node5 
x:dosweb2016] org.apache.solr.search.SolrIndexSearcher; Opening 
[Searcher@644a8d33[dosweb2016] realtime]
INFO  - 2017-09-19 16:42:20.668; [c:dosweb2016 s:shard2 r:core_node5 
x:dosweb2016] org.apache.solr.update.DirectUpdateHandler2; end_commit_flush



thanks,
--
Kaveh Minooie


Solr replication

2017-09-19 Thread Satyaprashant Bezwada
I need some input or help resolving replication across Solr nodes. We have
installed Solr 6.5 in cloud mode with 3 ZooKeepers and 2 Solr nodes
configured. I enabled Solr replication in my SolrJ client, but the replication
fails and is unable to create a collection.

The same code works in a different environment where I have 1 ZooKeeper
and 3 Solr nodes configured. Here is the exception I see on one of the Solr
nodes when I try to create a collection in the environment where it fails. I
have compared the solrconfig.xml in both environments and didn't see any
difference.



2017-09-19 22:09:35.471 ERROR 
(OverseerThreadFactory-8-thread-1-processing-n:sr01:8983_solr) [   ] 
o.a.s.c.OverseerCollectionMessageHandler Cleaning up collection [CIUZLW].
2017-09-19 22:09:35.475 INFO  
(OverseerThreadFactory-8-thread-1-processing-n:sr01:8983_solr) [   ] 
o.a.s.c.OverseerCollectionMessageHandler Executing Collection Cmd : 
action=UNLOAD&deleteInstanceDir=true&deleteDataDir=true
2017-09-19 22:09:35.486 INFO  (qtp401424608-15) [   ] o.a.s.s.HttpSolrCall 
[admin] webapp=null path=/admin/cores 
params={deleteInstanceDir=true&core=CIUZLW_shard1_replica1&qt=/admin/cores&deleteDataDir=true&action=UNLOAD&wt=javabin&version=2}
 status=0 QTime=6
2017-09-19 22:09:36.194 INFO  
(OverseerThreadFactory-8-thread-1-processing-n:sr01:8983_solr) [   ] 
o.a.s.c.CreateCollectionCmd Cleaned up artifacts for failed create collection 
for [CIUZLW]
2017-09-19 22:09:38.008 INFO  
(OverseerCollectionConfigSetProcessor-170740497916499012-sr01:8983_solr-n_29)
 [   ] o.a.s.c.OverseerTaskQueue Response ZK path: 
/overseer/collection-queue-work/qnr-000410 doesn't exist.  Requestor may 
have disconnected from ZooKeeper
2017-09-19 22:38:36.634 INFO  (ShutdownMonitor) [   ] o.a.s.c.CoreContainer 
Shutting down CoreContainer instance=1549725679
2017-09-19 22:38:36.644 INFO  (ShutdownMonitor) [   ] o.a.s.c.Overseer Overseer 
(id=170740497916499012-sr01:8983_solr-n_29) closing
2017-09-19 22:38:36.645 INFO  
(OverseerStateUpdate-170740497916499012-sr01:8983_solr-n_29) [   ] 
o.a.s.c.Overseer Overseer Loop exiting : sr01:8983_solr
2017-09-19 22:38:36.654 INFO  (ShutdownMonitor) [   ] o.a.s.m.SolrMetricManager 
Closing metric reporters for: solr.node
2017-09-19 23:12:22.230 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter Loading 
solr.xml from SolrHome (not found in ZooKeeper)
2017-09-19 23:12:22.233 INFO  (main) [   ] o.a.s.c.SolrXmlConfig Loading 
container configuration from /var/solr/data/Node122/solr.xml
2017-09-19 23:12:22.644 INFO  (main) [   ] o.a.s.u.UpdateShardHandler Creating 
UpdateShardHandler HTTP client with params: 
socketTimeout=60&connTimeout=6&retry=true
2017-09-19 23:12:22.650 INFO  (main) [   ] o.a.s.c.ZkContainer Zookeeper 
client=zk01:2181,zk02:2181,zk03:2181
2017-09-19 23:12:22.731 INFO  (main) [   ] o.a.s.c.c.ZkStateReader Updated live 
nodes from ZooKeeper... (0) -> (1)
2017-09-19 23:12:22.762 INFO  (main) [   ] o.a.s.c.Overseer Ove

Re: Solrcloud configuration

2017-09-19 Thread John Bickerstaff
Addendum: It's not SQL Server, but I imagine the steps will be similar if
not identical, except for the details of the JDBC driver you need.

On Tue, Sep 19, 2017 at 4:11 PM, John Bickerstaff 
wrote:

> This may also be of some assistance:
>
> https://gist.github.com/maxivak/3e3ee1fca32f3949f052
>
> I haven't tested, just found it.
>
> On Tue, Sep 19, 2017 at 4:10 PM, John Bickerstaff <
> j...@johnbickerstaff.com> wrote:
>
>> This may be of some assistance...
>>
>> http://lucene.apache.org/solr/guide/6_6/
>>
>> There is a section discussing sharding and another section that includes
>> the schema.
>>
>> On Tue, Sep 19, 2017 at 1:42 PM, Shashi Roushan 
>> wrote:
>>
>>> Hello David
>>>
>>> No, I didn't read any documentation on the schema and DIH.
>>>
>>> Actually we are already using Solr 4. I am now upgrading to
>>> SolrCloud with shards. I have done lots of googling, but am not finding
>>> relevant information on DIH and the schema with Solr shards. I am getting
>>> results with the older version of Solr.
>>>
>>>
>>> On Sep 20, 2017 12:58 AM, "David Hastings" >> >
>>> wrote:
>>>
>>> Did you read the documentation on the schema and the DIH?
>>>
>>> On Tue, Sep 19, 2017 at 3:04 PM, Shashi Roushan >> >
>>> wrote:
>>>
>>> > Hi All,
>>> >
>>> > I need your help to configure SolrCloud with shards.
>>> > I have created a collection and shards using Solr 6 and ZooKeeper. It's
>>> working
>>> > fine.
>>> > My problems are:
>>> > Where do I put the schema and dbdataconfig files?
>>> > How can I use DIH to import data from SQL Server into Solr?
>>> > In the older version I was using the schema and DIH to import data from SQL
>>> > server.
>>> >
>>> > Please help.
>>> >
>>> > Regards
>>> > Shashi Roushan
>>> >
>>>
>>
>>
>


Re: Solrcloud configuration

2017-09-19 Thread John Bickerstaff
This may also be of some assistance:

https://gist.github.com/maxivak/3e3ee1fca32f3949f052

I haven't tested, just found it.

On Tue, Sep 19, 2017 at 4:10 PM, John Bickerstaff 
wrote:

> This may be of some assistance...
>
> http://lucene.apache.org/solr/guide/6_6/
>
> There is a section discussing sharding and another section that includes
> the schema.
>
> On Tue, Sep 19, 2017 at 1:42 PM, Shashi Roushan 
> wrote:
>
>> Hello David
>>
>> No, I didn't read any documentation on the schema and DIH.
>>
>> Actually we are already using Solr 4. I am now upgrading to
>> SolrCloud with shards. I have done lots of googling, but am not finding
>> relevant information on DIH and the schema with Solr shards. I am getting
>> results with the older version of Solr.
>>
>>
>> On Sep 20, 2017 12:58 AM, "David Hastings" 
>> wrote:
>>
>> Did you read the documentation on the schema and the DIH?
>>
>> On Tue, Sep 19, 2017 at 3:04 PM, Shashi Roushan 
>> wrote:
>>
>> > Hi All,
>> >
>> > I need your help to configure SolrCloud with shards.
>> > I have created a collection and shards using Solr 6 and ZooKeeper. It's
>> working
>> > fine.
>> > My problems are:
>> > Where do I put the schema and dbdataconfig files?
>> > How can I use DIH to import data from SQL Server into Solr?
>> > In the older version I was using the schema and DIH to import data from SQL
>> > server.
>> >
>> > Please help.
>> >
>> > Regards
>> > Shashi Roushan
>> >
>>
>
>


Re: Solrcloud configuration

2017-09-19 Thread John Bickerstaff
This may be of some assistance...

http://lucene.apache.org/solr/guide/6_6/

There is a section discussing sharding and another section that includes
the schema.

On Tue, Sep 19, 2017 at 1:42 PM, Shashi Roushan 
wrote:

> Hello David
>
> No, I didn't read any documentation on the schema and DIH.
>
> Actually we are already using Solr 4. I am now upgrading to
> SolrCloud with shards. I have done lots of googling, but am not finding relevant
> information on DIH and the schema with Solr shards. I am getting results with the
> older version of Solr.
>
>
> On Sep 20, 2017 12:58 AM, "David Hastings" 
> wrote:
>
> Did you read the documentation on the schema and the DIH?
>
> On Tue, Sep 19, 2017 at 3:04 PM, Shashi Roushan 
> wrote:
>
> > Hi All,
> >
> > I need your help to configure SolrCloud with shards.
> > I have created a collection and shards using Solr 6 and ZooKeeper. It's
> working
> > fine.
> > My problems are:
> > Where do I put the schema and dbdataconfig files?
> > How can I use DIH to import data from SQL Server into Solr?
> > In the older version I was using the schema and DIH to import data from SQL
> > server.
> >
> > Please help.
> >
> > Regards
> > Shashi Roushan
> >
>
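(Editor's note: the short answer to "where do I put the files" in SolrCloud is that the schema, solrconfig.xml, and any DIH data-config file live together in a configset uploaded to ZooKeeper, not on local disk. A sketch of the usual flow — the ZooKeeper address, configset name, and collection name here are hypothetical:)

```
bin/solr zk upconfig -z localhost:2181 -n myconf -d /path/to/conf
bin/solr create -c mycollection -n myconf -shards 2
```

The DIH request handler declared in solrconfig.xml then picks up the data-config file from that same configset.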


Re: How to remove control characters in stored value at Solr side

2017-09-19 Thread Shawn Heisey
On 9/18/2017 12:45 PM, Markus Jelsma wrote:
> But, can you then explain why Apache Nutch with SolrJ had this problem? It
> seems that by default SolrJ does use XML as the transport format. We have always
> used SolrJ, which I assumed would default to javabin, but we had this exact
> problem anyway, and solved it by stripping non-character code points.
>
> When we use SolrJ for querying we clearly see wt=javabin in the logs, but 
> updates showed the problem. Can we fix it anywhere?

The wt parameter controls the *response*, not the *request*.

The cloud client started using javabin by default for requests in
version 4.6 (SOLR-5223), but the HTTP client used XML for requests by
default up until version 5.5 (SOLR-8595).  The current trunk Nutch code
is using SolrJ 5.4.1 and HttpSolrClient, which means that Nutch is
sending XML to Solr.  The wt parameter on those requests is javabin, so
the response that Solr sends back is binary.

SolrJ should handle translating the input so that it's valid XML, but
maybe there are characters that SolrJ's XML request writer doesn't (or
can't) handle correctly.
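(Editor's note: Nutch and SolrJ are Java, but the stripping Markus describes amounts to dropping every code point that is not allowed in an XML 1.0 document. A small Python sketch of that filter — the function name is my own:)

```python
def strip_invalid_xml_chars(text: str) -> str:
    """Drop code points that are not legal in an XML 1.0 document."""
    def ok(cp: int) -> bool:
        # Valid XML 1.0 Char ranges: tab, LF, CR, then the three main blocks.
        return (cp in (0x09, 0x0A, 0x0D)
                or 0x20 <= cp <= 0xD7FF
                or 0xE000 <= cp <= 0xFFFD
                or 0x10000 <= cp <= 0x10FFFF)
    return "".join(ch for ch in text if ok(ord(ch)))

print(strip_invalid_xml_chars("clean\x00 text\x08 here"))  # -> clean text here
```

Running update payloads through a filter like this before they reach an XML request writer avoids the malformed-update problem regardless of which client version is in use.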

Thanks,
Shawn



Re: Solr Streaming Question

2017-09-19 Thread Erick Erickson
Webster:

I think you're looking for UpdateStream. Unfortunately the fix version
wasn't entered so you'll have to look at your particular version but
going strictly from the dates it appears in 6.0.

David:

Stored is irrelevant. Streaming only works with docValues="true"
fields and moves the docValues content over.

Best,
Erick
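(Editor's note: a field usable by streaming therefore needs docValues enabled in the schema; a minimal field definition — the field name here is hypothetical — would look like:)

```xml
<field name="name_s" type="string" indexed="true" stored="false" docValues="true"/>
```

As Erick says, stored can be false: streaming moves the docValues content over, not the stored values.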

On Tue, Sep 19, 2017 at 12:39 PM, David Hastings
 wrote:
> I am also curious about this, specifically about indexed/non stored fields.
>
> On Tue, Sep 19, 2017 at 3:33 PM, Webster Homer 
> wrote:
>
>> Is it possible to use the streaming API to stream documents from a
>> collection and load them into a new collection? I was thinking that this
>> would be a great way to get a random sample of data from our main
>> collections to developer machines. Making it a random sample would be
>> useful as well. This looks feasible, but I've only scratched the surface of
>> streaming Solr
>>
>> Thanks
>>
>>


Issue with scoreNodes stream expression

2017-09-19 Thread Aurélien MAZOYER
Hi,

 

I wanted to try the new scoreNodes stream expression that is used to make
recommendations:

https://cwiki.apache.org/confluence/display/solr/Graph+Traversal#GraphTraversal-UsingthescoreNodesFunctiontoMakeaRecommendation

but encountered some issue with it.

 

The following steps can easily reproduce the problem:

I started Solr (6.6.1) in cloud mode : 

solr -e cloud -noprompt

then run the following command in exampledocs to index the sample data :

java -Dc=gettingstarted -jar post.jar *.xml

and finally copy/pasted the following expression into the stream tab:

scoreNodes(top(n=25,
               sort="count(*) desc",
               nodes(gettingstarted,
                     random(gettingstarted, q="*:*", fl="id", rows="200"),
                     walk="id->id",
                     gather="id",
                     count(*))))

(yes, I know that my stream expression does nothing useful :-P).

Anyway, I got the following exception when I run the query:

"EXCEPTION": "org.apache.solr.client.solrj.SolrServerException: No
collection param specified on request and no default collection has been
set.",

Any idea of what I did wrong?

 

Thank you,

 

Regards,

 

Aurélien

 

 

 



Re: Solrcloud configuration

2017-09-19 Thread Shashi Roushan
Hello David

No, I didn't read any documentation on the schema and DIH.

Actually we are already using Solr 4. I am now upgrading to
SolrCloud with shards. I have done lots of googling, but am not finding relevant
information on DIH and the schema with Solr shards. I am getting results with the
older version of Solr.


On Sep 20, 2017 12:58 AM, "David Hastings" 
wrote:

Did you read the documentation on the schema and the DIH?

On Tue, Sep 19, 2017 at 3:04 PM, Shashi Roushan 
wrote:

> Hi All,
>
> I need your help to configure SolrCloud with shards.
> I have created a collection and shards using Solr 6 and ZooKeeper. It's
working
> fine.
> My problems are:
> Where do I put the schema and dbdataconfig files?
> How can I use DIH to import data from SQL Server into Solr?
> In the older version I was using the schema and DIH to import data from SQL
> server.
>
> Please help.
>
> Regards
> Shashi Roushan
>


Re: Solr Streaming Question

2017-09-19 Thread David Hastings
I am also curious about this, specifically about indexed/non stored fields.

On Tue, Sep 19, 2017 at 3:33 PM, Webster Homer 
wrote:

> Is it possible to use the streaming API to stream documents from a
> collection and load them into a new collection? I was thinking that this
> would be a great way to get a random sample of data from our main
> collections to developer machines. Making it a random sample would be
> useful as well. This looks feasible, but I've only scratched the surface of
> streaming Solr
>
> Thanks
>
>


Solr Streaming Question

2017-09-19 Thread Webster Homer
Is it possible to use the streaming API to stream documents from a
collection and load them into a new collection? I was thinking that this
would be a great way to get a random sample of data from our main
collections to developer machines. Making it a random sample would be
useful as well. This looks feasible, but I've only scratched the surface of
streaming Solr

Thanks
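The pieces suggested elsewhere in this thread can be combined into a single expression. A sketch with assumed collection and field names (fl must list every field you want copied, and each random() call draws an independent sample):

```
update(devSample, batchSize=250,
       list(random(products, q="*:*", fl="id,name_s,price_f", rows="5000"),
            random(products, q="*:*", fl="id,name_s,price_f", rows="5000")))
```

Since the samples are drawn independently, duplicates are possible; overwriting by id on the destination collection makes that harmless.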



Re: Solrcloud configuration

2017-09-19 Thread David Hastings
Did you read the documentation on the schema and the DIH?

On Tue, Sep 19, 2017 at 3:04 PM, Shashi Roushan 
wrote:

> Hi All,
>
> I need your help to configure solrcloud with shards.
> I have created collection and shards using solr6 and Zookeeper. Its working
> fine.
> My problems are:
> Where I put schema and dbdataconfig files?
> How I can use DIH to import data from SQL server To solr?
>  In older version I was using schema and DIH to import data from SQL
> server.
>
> Please help.
>
> Regards
> Shashi Roushan
>


Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-19 Thread Susheel Kumar
+1. Asking for way more than you need may result in OOM. rows
and facet.limit should be set carefully.

On Tue, Sep 19, 2017 at 1:23 PM, Toke Eskildsen  wrote:

> shamik  wrote:
> > I've facet.limit=-1 configured for few search types, but facet.mincount
> is
> > always set as 1. Didn't know that's detrimental to doc values.
>
> It is if you have a lot (1000+) of unique values in your facet field,
> especially when you have more than 1 shard. Only ask for the number you
> need. Same goes for rows BTW.
>
> - Toke Eskildsen
>


Solrcloud configuration

2017-09-19 Thread Shashi Roushan
Hi All,

I need your help to configure SolrCloud with shards.
I have created a collection and shards using Solr 6 and ZooKeeper. It's working
fine.
My problems are:
Where do I put the schema and dbdataconfig files?
How can I use DIH to import data from SQL Server to Solr?
In the older version I was using the schema and DIH to import data from SQL Server.

Please help.

Regards
Shashi Roushan
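For what it's worth, in SolrCloud the config files no longer live on disk per core: they are uploaded to ZooKeeper as a configset, and DIH is still driven through the /dataimport request handler defined in solrconfig.xml. A command sketch (ports, paths, and names assumed):

```
# Upload the config directory (solrconfig.xml, schema, db-data-config.xml) to ZooKeeper
bin/solr zk upconfig -z localhost:9983 -n myconf -d /path/to/conf

# Create a sharded collection backed by that configset
bin/solr create -c mycollection -n myconf -shards 2 -replicationFactor 2

# Trigger a DIH full import (the SQL Server JDBC driver jar must be on Solr's classpath)
curl "http://localhost:8983/solr/mycollection/dataimport?command=full-import"
```

After editing the configset, re-run `zk upconfig` with the same name and reload the collection so the changes take effect.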


Re: no search results for specific search in solr 6.6.0

2017-09-19 Thread Erick Erickson
Unfortunately the link you provided goes to "localhost", which isn't accessible.

The very first thing I'd do is go to the admin/analysis page and put
the terms in both the "index" and "query" boxes for the field in
question.
Next, attach &debug=query to the query to see how the query is actually parsed.

My bet: You are using a different stemmer for the two cases and the
actual token in the index is FRao in the problem field, but that's
just a guess.

It often fools people that the field returned in the document (i.e. in
the fl list) is the _stored_ value, not the actual token in the index.
You can also use the TermsComponent to see the actual terms in the
index as well as the admin/schema_browser link.

Best,
Erick
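To see why Erick's guess is plausible, here is a toy, JDK-only model (the "analyzer" below is a hypothetical stand-in for the text_fr chain, not the real FrenchLightStemFilter): wildcard terms bypass analysis, so a query fragment containing the full surface form can miss the stemmed token that is actually in the index.

```java
import java.util.Locale;

public class WildcardMismatch {

    // Hypothetical stand-in for index-time analysis: lowercase, then a
    // "stemmer" that strips one trailing 'o' (illustrative only).
    public static String analyze(String input) {
        String t = input.toLowerCase(Locale.ROOT);
        return t.endsWith("o") ? t.substring(0, t.length() - 1) : t;
    }

    // Wildcard query terms are NOT analyzed; at most they get lowercased.
    public static boolean wildcardContains(String indexedToken, String fragment) {
        return indexedToken.contains(fragment.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String indexed = analyze("FRaoo");                       // -> "frao"
        System.out.println(wildcardContains(indexed, "FRao"));   // true
        System.out.println(wildcardContains(indexed, "FRaoo"));  // false
    }
}
```

The exact-term query FRaoo still matches because it *is* analyzed at query time and reduces to the same token as at index time — which is consistent with the behavior reported above.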


On Tue, Sep 19, 2017 at 9:01 AM, Sascha Tuschinski
 wrote:
> Hello Community,
>
> We are using a Solr Core with Solr 6.6.0 on Windows 10 (latest updates) with 
> field names defined like "f_1179014266_txt". The number in the middle of the 
> name differs for each field we use. For language-specific fields we are 
> adding a language-specific extension, e.g. "f_1179014267_txt_fr", 
> "f_1179014268_txt_de", "f_1179014269_txt_en" and so on.
> We are having the following odd issue within the French "_fr" field only:
> Field
> f_1197829835_txt_fr
> Dynamic Field /
> *_txt_fr
> Type
> text_fr
>
>   *   The saved value which had been added with no problem to the Solr index 
> is "FRaoo".
>   *   When searching within the Solr query tool for 
> "f_1197829839_txt_fr:*FRao*" it returns the items matching the term as seen 
> below - OK.
> {
>   "responseHeader":{
> "status":0,
> "QTime":1,
> "params":{
>   "q":"f_1197829839_txt_fr:*FRao*",
>   "indent":"on",
>   "wt":"json",
>   "_":"1505808887827"}},
>   "response":{"numFound":1,"start":0,"docs":[
>   {
> "id":"129",
> "f_1197829834_txt_en":"EnAir",
> "f_1197829822_txt_de":"Lufti",
> "f_1197829835_txt_fr":"FRaoi",
> "f_1197829836_txt_it":"ITAir",
> "f_1197829799_txt":["Lufti"],
> "f_1197829838_txt_en":"EnAir",
> "f_1197829839_txt_fr":"FRaoo",
> "f_1197829840_txt_it":"ITAir",
> "_version_":1578520424165146624}]
>   }}
>
>   *   When searching for "f_1197829839_txt_fr:*FRaoo*" NO item is found - 
> Wrong!
> {
>   "responseHeader":{
> "status":0,
> "QTime":1,
> "params":{
>   "q":"f_1197829839_txt_fr:*FRaoo*",
>   "indent":"on",
>   "wt":"json",
>   "_":"1505808887827"}},
>   "response":{"numFound":0,"start":0,"docs":[]
>   }}
> When searching for "f_1197829839_txt_fr:FRaoo" (no wildcards) the matching 
> items are found - OK
>
> {
>   "responseHeader":{
> "status":0,
> "QTime":1,
> "params":{
>   "q":"f_1197829839_txt_fr:FRaoo",
>   "indent":"on",
>   "wt":"json",
>   "_":"1505808887827"}},
>   "response":{"numFound":1,"start":0,"docs":[
>   {
> "id":"129",
> "f_1197829834_txt_en":"EnAir",
> "f_1197829822_txt_de":"Lufti",
> "f_1197829835_txt_fr":"FRaoi",
> "f_1197829836_txt_it":"ITAir",
> "f_1197829799_txt":["Lufti"],
> "f_1197829838_txt_en":"EnAir",
> "f_1197829839_txt_fr":"FRaoo",
> "f_1197829840_txt_it":"ITAir",
> "_version_":1578520424165146624}]
>   }}
> If we save exactly the same value into a different language field, e.g. one 
> ending in "_en", meaning "f_1197829834_txt_en", then the search 
> "f_1197829834_txt_en:*FRaoo*" finds all items correctly!
> We have no idea what's wrong here; we even recreated the index and can 
> reproduce this problem every time. I can only see that the value starts 
> with "FR" and the field extension ends with "fr", but this is not a problem for 
> "en", "de" and so on. All fields are used in the same way and have the same 
> field properties.
> Any help or ideas are highly appreciated. I filed a bug for this 
> https://issues.apache.org/jira/browse/SOLR-11367 but had been asked to 
> publish my question here. Thanks for reading.
> Greetings,
> ___
> Sascha Tuschinski
> Manager Quality Assurance // Canto GmbH
> Phone: +49 (0) 30 ­ 390 485 - 41
> E-mail: stuschin...@canto.com
> Web: canto.com
>
> Canto GmbH
> Lietzenburger Str. 46
> 10789 Berlin
> Phone: +49 (0)30 390485-0
> Fax: +49 (0)30 390485-55
> Amtsgericht Berlin-Charlottenburg HRB 88566
> Geschäftsführer: Jack McGannon, Thomas Mockenhaupt
>


Re: no search results for specific search in solr 6.6.0

2017-09-19 Thread Josh Lincoln
Can you provide the fieldType definition for text_fr?

Also, when you use the Analysis page in the admin UI, what tokens are
generated during indexing for FRaoo using the text_fr fieldType?

On Tue, Sep 19, 2017 at 12:01 PM Sascha Tuschinski 
wrote:

> Hello Community,
>
> We are using a Solr Core with Solr 6.6.0 on Windows 10 (latest updates)
> with field names defined like "f_1179014266_txt". The number in the middle
> of the name differs for each field we use. For language specific fields we
> are adding an language specific extension e.g. "f_1179014267_txt_fr",
> "f_1179014268_txt_de", "f_1179014269_txt_en" and so on.
> We are having the following odd issue within the french "_fr" field only:
> Field
> f_1197829835_txt_fr<
> http://localhost:8983/solr/#/test_core/schema?field=f_1197829835_txt_fr>
> Dynamic Field /
> *_txt_fr<
> http://localhost:8983/solr/#/test_core/schema?dynamic-field=*_txt_fr>
> Type
> text_fr
>
>   *   The saved value which had been added with no problem to the Solr
> index is "FRaoo".
>   *   When searching within the Solr query tool for
> "f_1197829839_txt_fr:*FRao*" it returns the items matching the term as seen
> below - OK.
> {
>   "responseHeader":{
> "status":0,
> "QTime":1,
> "params":{
>   "q":"f_1197829839_txt_fr:*FRao*",
>   "indent":"on",
>   "wt":"json",
>   "_":"1505808887827"}},
>   "response":{"numFound":1,"start":0,"docs":[
>   {
> "id":"129",
> "f_1197829834_txt_en":"EnAir",
> "f_1197829822_txt_de":"Lufti",
> "f_1197829835_txt_fr":"FRaoi",
> "f_1197829836_txt_it":"ITAir",
> "f_1197829799_txt":["Lufti"],
> "f_1197829838_txt_en":"EnAir",
> "f_1197829839_txt_fr":"FRaoo",
> "f_1197829840_txt_it":"ITAir",
> "_version_":1578520424165146624}]
>   }}
>
>   *   When searching for "f_1197829839_txt_fr:*FRaoo*" NO item is found -
> Wrong!
> {
>   "responseHeader":{
> "status":0,
> "QTime":1,
> "params":{
>   "q":"f_1197829839_txt_fr:*FRaoo*",
>   "indent":"on",
>   "wt":"json",
>   "_":"1505808887827"}},
>   "response":{"numFound":0,"start":0,"docs":[]
>   }}
> When searching for "f_1197829839_txt_fr:FRaoo" (no wildcards) the matching
> items are found - OK
>
> {
>   "responseHeader":{
> "status":0,
> "QTime":1,
> "params":{
>   "q":"f_1197829839_txt_fr:FRaoo",
>   "indent":"on",
>   "wt":"json",
>   "_":"1505808887827"}},
>   "response":{"numFound":1,"start":0,"docs":[
>   {
> "id":"129",
> "f_1197829834_txt_en":"EnAir",
> "f_1197829822_txt_de":"Lufti",
> "f_1197829835_txt_fr":"FRaoi",
> "f_1197829836_txt_it":"ITAir",
> "f_1197829799_txt":["Lufti"],
> "f_1197829838_txt_en":"EnAir",
> "f_1197829839_txt_fr":"FRaoo",
> "f_1197829840_txt_it":"ITAir",
> "_version_":1578520424165146624}]
>   }}
> If we save exact the same value into a different language field e.g.
> ending on "_en", means "f_1197829834_txt_en", then the search
> "f_1197829834_txt_en:*FRaoo*" find all items correctly!
> We have no idea what's wrong here and we even recreated the index and can
> reproduce this problem all the time. I can only see that the value starts
> with "FR" and the field extension ends with "fr" but this is not problem
> for "en", "de" an so on. All fields are used in the same way and have the
> same field properties.
> Any help or ideas are highly appreciated. I filed a bug for this
> https://issues.apache.org/jira/browse/SOLR-11367 but had been asked to
> publish my question here. Thanks for reading.
> Greetings,
> ___
> Sascha Tuschinski
> Manager Quality Assurance // Canto GmbH
> Phone: +49 (0) 30 ­ 390 485 - 41 <+49%2030%2039048541>
> E-mail: stuschin...@canto.com
> Web: canto.com
>
> Canto GmbH
> Lietzenburger Str. 46
> 10789 Berlin
> Phone: +49 (0)30 390485-0
> Fax: +49 (0)30 390485-55 <+49%2030%2039048555>
> Amtsgericht Berlin-Charlottenburg HRB 88566
> Geschäftsführer: Jack McGannon, Thomas Mockenhaupt
>
>


Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-19 Thread Toke Eskildsen
shamik  wrote:
> I've facet.limit=-1 configured for few search types, but facet.mincount is
> always set as 1. Didn't know that's detrimental to doc values.

It is if you have a lot (1000+) of unique values in your facet field, 
especially when you have more than 1 shard. Only ask for the number you need. 
Same goes for rows BTW.

- Toke Eskildsen
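In parameter terms, the advice amounts to something like this (field name hypothetical) — request what the page actually displays instead of facet.limit=-1:

```
q=*:*&rows=20
&facet=true
&facet.field=category
&facet.limit=100
&facet.mincount=1
```

With facet.limit=-1 on a multi-shard collection, every shard has to ship its complete term list to the coordinating node for merging, which is where the memory goes.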


Re: Using SOLR J 5.5.4 with SOLR 6.5

2017-09-19 Thread Walter Underwood
As I understand it, any node in the cluster will direct the document to the 
leader for the appropriate shard.

Works for us.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Sep 19, 2017, at 9:59 AM, David Hastings  
> wrote:
> 
> Thanks! Going to have to throw up another solr 6.x instance for testing
> again.  Solr cloud will maintain index integrity across the nodes if
> indexed to just one node correct?
> 
> On Tue, Sep 19, 2017 at 12:55 PM, Walter Underwood 
> wrote:
> 
>> Yes, good old HTTP.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>> 
>>> On Sep 19, 2017, at 9:54 AM, David Hastings <
>> hastings.recurs...@gmail.com> wrote:
>>> 
>>> Do you use HttpSolrClient then?
>>> 
>>> On Tue, Sep 19, 2017 at 12:26 PM, Walter Underwood <
>> wun...@wunderwood.org>
>>> wrote:
>>> 
 We run SolrJ 4.7.1 with Solr 6.5.1 (16 node cloud). No problems.
 
 We do not use the cloud-specific client and I’m pretty sure that we
>> don’t
 use ConcurrentUpdateSolrServer. The latter is because it doesn’t report
 errors properly.
 
 We do our indexing through the load balancer and let the Solr Cloud
 cluster get the right docs to the right shards. That runs at 1 million
 docs/minute, so it isn’t worth doing anything fancier.
 
 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/  (my blog)
 
 
> On Sep 19, 2017, at 9:05 AM, David Hastings <
 hastings.recurs...@gmail.com> wrote:
> 
> What about the ConcurrentUpdateSolrServer for solrj?  That is what
>> almost
> all of my indexing code is using for solr 5.x, Its been a while since I
> experimented with upgrading but i seem to remember having to go
> to HttpSolrClient and couldnt get the code to compile, so i tabled the
> experiment for a while.  eventually I will need to move to solr 6, but
 if i
> could keep the same indexing code that would be ideal
> 
> On Tue, Sep 19, 2017 at 11:59 AM, Erick Erickson <
 erickerick...@gmail.com>
> wrote:
> 
>> Felix:
>> 
>> There's no specific testing that I know of for this issue, it's "best
>> effort". Which means it _should_ work but I can't make promises.
>> 
>> Now that said, underlying it all is just HTTP requests going back and
>> forth so I know of no a-priori reasons it wouldn't be fine. It's just
>> "try it and see" though.
>> 
>> Best,
>> Erick
>> 
>> I'm probably preaching to the choir, but Java 1.7 is two years past
>> the end of support from Oracle, somebody sometime has to deal with
>> upgrading.
>> 
>> On Mon, Sep 18, 2017 at 10:47 PM, Felix Stanley
>>  wrote:
>>> Hi there,
>>> 
>>> 
>>> 
>>> We are planning to use SOLR J 5.5.4 to query from SOLR 6.5.
>>> 
>>> The reason was that we have to rely on JDK 1.7 at the client and as
>> far
>> as I
>>> know SOLR J 6.x.x only support JDK 1.8.
>>> 
>>> I understood that SOLR J generally maintains backwards/forward
>> compatibility
>>> from this article:
>>> 
>>> 
>>> 
>>> https://wiki.apache.org/solr/Solrj
>>> 
>>> 
>>> 
>>> Would there though be any exception that we need to take caution of
>> for
>> this
>>> specific version?
>>> 
>>> 
>>> 
>>> Thanks a lot.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Best Regards,
>>> 
>>> 
>>> 
>>> Felix Stanley
>>> 
>>> 
>>> 
>>> 
>>> --
>>> CONFIDENTIALITY NOTICE
>>> 
>>> This e-mail (including any attachments) may contain confidential
>> and/or
>> privileged information. If you are not the intended recipient or have
>> received this e-mail in error, please inform the sender immediately
>> and
>> delete this e-mail (including any attachments) from your computer, and
 you
>> must not use, disclose to anyone else or copy this e-mail (including
>> any
>> attachments), whether in whole or in part.
>>> 
>>> This e-mail and any reply to it may be monitored for security, legal,
>> regulatory compliance and/or other appropriate reasons.
>> 
 
 
>> 
>> 



Re: Sorting by distance resources with WKT polygon data

2017-09-19 Thread David Smiley
Hello,

Sorry for the belated response.

Solr only supports sorting from points or rectangles in the index.  For
rectangles use BBoxField.  For points, ideally use the new
LatLonPointSpatialField; failing that use LatLonType.  You can use RPT for
point data but I don't recommend sorting with it; use one of the others
just mentioned.

~ David
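One way to read that advice in practice (field and type names assumed): index a point into a LatLonPointSpatialField and sort with geodist(), keeping the polygon in a separate RPT field purely for shape filtering.

```
<!-- schema sketch -->
<fieldType name="location" class="solr.LatLonPointSpatialField" docValues="true"/>
<field name="position" type="location" indexed="true" stored="true"/>

<!-- query: filter within 10 km of the point, sort nearest first -->
q=*:*&fq={!geofilt}&sfield=position&pt=45.52,-73.53&d=10&sort=geodist() asc
```

The point field exists so that distance sorting has something cheap and well-defined to work from; distance "to a polygon" is not something the sort machinery supports.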

On Tue, Sep 12, 2017 at 5:09 PM Grondin Luc 
wrote:

> Hello,
>
> I am having difficulties with sorting by distance for resources indexed with
> WKT geolocation data. I have tried different field configurations and query
> parameters and I did not get working results.
>
> I am using SOLR 6.6 and JTS-core 1.14. My test sample includes resources
> with point coordinates plus one associated with a polygon. I tried using
> both fieldtypes "solr.SpatialRecursivePrefixTreeFieldType" and
> "solr.RptWithGeometrySpatialField". In both cases, I get good results if I
> do not care about sorting. The problem arises when I include sorting.
>
> With SpatialRecursivePrefixTreeFieldType:
>
> The best request I used, based on the documentation I could find, was:
>
> select?fl=*,score&q={!geofilt%20sfield=PositionGeo%20pt=45.52,-73.53%20d=10%20score=distance}&sort=score%20asc
>
> The distance appears to be correctly evaluated for resources indexed with
> point coordinates. However, it is wrong for the resource with a polygon
>
> 
>   2.3913236
>   4.3242383
>   4.671504
>   4.806902
>   20015.115
> 
>
> (Please note that I have verified the polygon externally and it is correct)
>
> With solr.RptWithGeometrySpatialField:
>
> I get an exception triggered by the presence of « score=distance » in the
> request «
> q={!geofilt%20sfield=PositionGeo%20pt=45.52,-73.53%20d=10%20score=distance}
> »
>
> java.lang.UnsupportedOperationException
> at
> org.apache.lucene.spatial.composite.CompositeSpatialStrategy.makeDistanceValueSource(CompositeSpatialStrategy.java:92)
> at
> org.apache.solr.schema.AbstractSpatialFieldType.getValueSourceFromSpatialArgs(AbstractSpatialFieldType.java:412)
> at
> org.apache.solr.schema.AbstractSpatialFieldType.getQueryFromSpatialArgs(AbstractSpatialFieldType.java:359)
> at
> org.apache.solr.schema.AbstractSpatialFieldType.createSpatialQuery(AbstractSpatialFieldType.java:308)
> at
> org.apache.solr.search.SpatialFilterQParser.parse(SpatialFilterQParser.java:80)
>
> From there, I am rather stuck with no ideas on how to resolve these
> problems. So advises in that regards would be much appreciated. I can
> provide more details if necessary.
>
> Thank you in advance,
>
>
>  ---
>   Luc Grondin
>   Analyste en gestion de l'information numérique
>   Centre d'expertise numérique pour la recherche - Université de Montréal
>   téléphone: 514-343-6111 <(514)%20343-6111> p. 3988  --
> luc.gron...@umontreal.ca
>
> --
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: Using SOLR J 5.5.4 with SOLR 6.5

2017-09-19 Thread David Hastings
Thanks! Going to have to throw up another solr 6.x instance for testing
again. SolrCloud will maintain index integrity across the nodes if
indexed to just one node, correct?

On Tue, Sep 19, 2017 at 12:55 PM, Walter Underwood 
wrote:

> Yes, good old HTTP.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Sep 19, 2017, at 9:54 AM, David Hastings <
> hastings.recurs...@gmail.com> wrote:
> >
> > Do you use HttpSolrClient then?
> >
> > On Tue, Sep 19, 2017 at 12:26 PM, Walter Underwood <
> wun...@wunderwood.org>
> > wrote:
> >
> >> We run SolrJ 4.7.1 with Solr 6.5.1 (16 node cloud). No problems.
> >>
> >> We do not use the cloud-specific client and I’m pretty sure that we
> don’t
> >> use ConcurrentUpdateSolrServer. The latter is because it doesn’t report
> >> errors properly.
> >>
> >> We do our indexing through the load balancer and let the Solr Cloud
> >> cluster get the right docs to the right shards. That runs at 1 million
> >> docs/minute, so it isn’t worth doing anything fancier.
> >>
> >> wunder
> >> Walter Underwood
> >> wun...@wunderwood.org
> >> http://observer.wunderwood.org/  (my blog)
> >>
> >>
> >>> On Sep 19, 2017, at 9:05 AM, David Hastings <
> >> hastings.recurs...@gmail.com> wrote:
> >>>
> >>> What about the ConcurrentUpdateSolrServer for solrj?  That is what
> almost
> >>> all of my indexing code is using for solr 5.x, Its been a while since I
> >>> experimented with upgrading but i seem to remember having to go
> >>> to HttpSolrClient and couldnt get the code to compile, so i tabled the
> >>> experiment for a while.  eventually I will need to move to solr 6, but
> >> if i
> >>> could keep the same indexing code that would be ideal
> >>>
> >>> On Tue, Sep 19, 2017 at 11:59 AM, Erick Erickson <
> >> erickerick...@gmail.com>
> >>> wrote:
> >>>
>  Felix:
> 
>  There's no specific testing that I know of for this issue, it's "best
>  effort". Which means it _should_ work but I can't make promises.
> 
>  Now that said, underlying it all is just HTTP requests going back and
>  forth so I know of no a-priori reasons it wouldn't be fine. It's just
>  "try it and see" though.
> 
>  Best,
>  Erick
> 
>  I'm probably preaching to the choir, but Java 1.7 is two years past
>  the end of support from Oracle, somebody sometime has to deal with
>  upgrading.
> 
>  On Mon, Sep 18, 2017 at 10:47 PM, Felix Stanley
>   wrote:
> > Hi there,
> >
> >
> >
> > We are planning to use SOLR J 5.5.4 to query from SOLR 6.5.
> >
> > The reason was that we have to rely on JDK 1.7 at the client and as
> far
>  as I
> > know SOLR J 6.x.x only support JDK 1.8.
> >
> > I understood that SOLR J generally maintains backwards/forward
>  compatibility
> > from this article:
> >
> >
> >
> > https://wiki.apache.org/solr/Solrj
> >
> >
> >
> > Would there though be any exception that we need to take caution of
> for
>  this
> > specific version?
> >
> >
> >
> > Thanks a lot.
> >
> >
> >
> >
> >
> > Best Regards,
> >
> >
> >
> > Felix Stanley
> >
> >
> >
> >
> 
> >>
> >>
>
>


Re: Using SOLR J 5.5.4 with SOLR 6.5

2017-09-19 Thread Walter Underwood
Yes, good old HTTP.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Sep 19, 2017, at 9:54 AM, David Hastings  
> wrote:
> 
> Do you use HttpSolrClient then?
> 
> On Tue, Sep 19, 2017 at 12:26 PM, Walter Underwood 
> wrote:
> 
>> We run SolrJ 4.7.1 with Solr 6.5.1 (16 node cloud). No problems.
>> 
>> We do not use the cloud-specific client and I’m pretty sure that we don’t
>> use ConcurrentUpdateSolrServer. The latter is because it doesn’t report
>> errors properly.
>> 
>> We do our indexing through the load balancer and let the Solr Cloud
>> cluster get the right docs to the right shards. That runs at 1 million
>> docs/minute, so it isn’t worth doing anything fancier.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>> 
>>> On Sep 19, 2017, at 9:05 AM, David Hastings <
>> hastings.recurs...@gmail.com> wrote:
>>> 
>>> What about the ConcurrentUpdateSolrServer for solrj?  That is what almost
>>> all of my indexing code is using for solr 5.x, Its been a while since I
>>> experimented with upgrading but i seem to remember having to go
>>> to HttpSolrClient and couldnt get the code to compile, so i tabled the
>>> experiment for a while.  eventually I will need to move to solr 6, but
>> if i
>>> could keep the same indexing code that would be ideal
>>> 
>>> On Tue, Sep 19, 2017 at 11:59 AM, Erick Erickson <
>> erickerick...@gmail.com>
>>> wrote:
>>> 
 Felix:
 
 There's no specific testing that I know of for this issue, it's "best
 effort". Which means it _should_ work but I can't make promises.
 
 Now that said, underlying it all is just HTTP requests going back and
 forth so I know of no a-priori reasons it wouldn't be fine. It's just
 "try it and see" though.
 
 Best,
 Erick
 
 I'm probably preaching to the choir, but Java 1.7 is two years past
 the end of support from Oracle, somebody sometime has to deal with
 upgrading.
 
 On Mon, Sep 18, 2017 at 10:47 PM, Felix Stanley
  wrote:
> Hi there,
> 
> 
> 
> We are planning to use SOLR J 5.5.4 to query from SOLR 6.5.
> 
> The reason was that we have to rely on JDK 1.7 at the client and as far
 as I
> know SOLR J 6.x.x only support JDK 1.8.
> 
> I understood that SOLR J generally maintains backwards/forward
 compatibility
> from this article:
> 
> 
> 
> https://wiki.apache.org/solr/Solrj
> 
> 
> 
> Would there though be any exception that we need to take caution of for
 this
> specific version?
> 
> 
> 
> Thanks a lot.
> 
> 
> 
> 
> 
> Best Regards,
> 
> 
> 
> Felix Stanley
> 
> 
> 
> 
 
>> 
>> 



Re: Solr returning same object in different page

2017-09-19 Thread alessandro.benedetti
Which version of Solr are you on?
Are you using SolrCloud or any distributed search?
In that case, I think (as already mentioned by Shawn) this could be related
[1].

if it is just plain Solr, my shot in the dark is your boost function :

{!boost+b=recip(ms(NOW,field1),3.16e-11,1,1)}{!boost+b=recip(ms(NOW,field2),3.16e-11,1,1)}
 

I see you use NOW (which changes continuously).
It is normally suggested to round it (for example to NOW/HOUR or NOW/DAY);
the right granularity depends on the use case.
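Concretely, the first boost quoted above with the timestamp rounded to the day:

```
{!boost b=recip(ms(NOW/DAY,field1),3.16e-11,1,1)}
```

With NOW/DAY the function is identical for every query issued on the same day, so scores stay stable between page requests and Solr's caches can actually be reused.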

Time passing should not change the ranking (though it does change the
score).
I can imagine that if, through rounding of the score, we end up with
different documents sharing the same score, the internal ordinal will be
used to rank them, producing slightly different orderings.
This is very unlikely, but if it is a single Solr node, it's the first
thing that jumps to my mind.

[1] https://issues.apache.org/jira/browse/SOLR-5821
[2] https://github.com/fguery/lucene-solr/tree/replicaChoice




-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Using SOLR J 5.5.4 with SOLR 6.5

2017-09-19 Thread David Hastings
Do you use HttpSolrClient then?

On Tue, Sep 19, 2017 at 12:26 PM, Walter Underwood 
wrote:

> We run SolrJ 4.7.1 with Solr 6.5.1 (16 node cloud). No problems.
>
> We do not use the cloud-specific client and I’m pretty sure that we don’t
> use ConcurrentUpdateSolrServer. The latter is because it doesn’t report
> errors properly.
>
> We do our indexing through the load balancer and let the Solr Cloud
> cluster get the right docs to the right shards. That runs at 1 million
> docs/minute, so it isn’t worth doing anything fancier.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Sep 19, 2017, at 9:05 AM, David Hastings <
> hastings.recurs...@gmail.com> wrote:
> >
> > What about the ConcurrentUpdateSolrServer for solrj?  That is what almost
> > all of my indexing code is using for solr 5.x, Its been a while since I
> > experimented with upgrading but i seem to remember having to go
> > to HttpSolrClient and couldnt get the code to compile, so i tabled the
> > experiment for a while.  eventually I will need to move to solr 6, but
> if i
> > could keep the same indexing code that would be ideal
> >
> > On Tue, Sep 19, 2017 at 11:59 AM, Erick Erickson <
> erickerick...@gmail.com>
> > wrote:
> >
> >> Felix:
> >>
> >> There's no specific testing that I know of for this issue, it's "best
> >> effort". Which means it _should_ work but I can't make promises.
> >>
> >> Now that said, underlying it all is just HTTP requests going back and
> >> forth so I know of no a-priori reasons it wouldn't be fine. It's just
> >> "try it and see" though.
> >>
> >> Best,
> >> Erick
> >>
> >> I'm probably preaching to the choir, but Java 1.7 is two years past
> >> the end of support from Oracle, somebody sometime has to deal with
> >> upgrading.
> >>
> >> On Mon, Sep 18, 2017 at 10:47 PM, Felix Stanley
> >>  wrote:
> >>> Hi there,
> >>>
> >>>
> >>>
> >>> We are planning to use SOLR J 5.5.4 to query from SOLR 6.5.
> >>>
> >>> The reason was that we have to rely on JDK 1.7 at the client and as far
> >> as I
> >>> know SOLR J 6.x.x only support JDK 1.8.
> >>>
> >>> I understood that SOLR J generally maintains backwards/forward
> >> compatibility
> >>> from this article:
> >>>
> >>>
> >>>
> >>> https://wiki.apache.org/solr/Solrj
> >>>
> >>>
> >>>
> >>> Would there though be any exception that we need to take caution of for
> >> this
> >>> specific version?
> >>>
> >>>
> >>>
> >>> Thanks a lot.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Best Regards,
> >>>
> >>>
> >>>
> >>> Felix Stanley
> >>>
> >>>
> >>>
> >>>
> >>
>
>


Re: [bulk]: Dates and DataImportHandler

2017-09-19 Thread Jamie Jackson
FWIW, I know mine worked, so maybe try:

<propertyWriter dateFormat="..." type="SimplePropertiesWriter" />

I can't conceive of what the locale would possibly do when a dateFormat is
specified, so I omitted the attribute. (Maybe one can specify dateFormat
*or *locale--it seems like specifying both would cause a clash.) For what
it's worth, the format you're trying to write seems identical to the
default*, so I'm not sure what benefit you're getting by using that
propertyWriter.

*It's identical to *my* default, anyway. Maybe the default changes based on
one's system configuration, I don't know. This stuff isn't very well
documented.
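One clue for the quoted NPE below: per the JDK javadoc, SimpleDateFormat's (pattern, locale) constructor throws NullPointerException outright if either argument is null — consistent with SimplePropertiesWriter.init failing to resolve the dateFormat or the locale from the config. A JDK-only sketch:

```java
import java.text.SimpleDateFormat;
import java.util.Locale;

public class NpeDemo {

    // True when SimpleDateFormat(pattern, locale) throws NullPointerException.
    public static boolean throwsNpe(String pattern, Locale locale) {
        try {
            new SimpleDateFormat(pattern, locale);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsNpe("yyyy-MM-dd HH:mm:ss", Locale.US)); // false
        System.out.println(throwsNpe("yyyy-MM-dd HH:mm:ss", null));      // true
        System.out.println(throwsNpe(null, Locale.US));                  // true
    }
}
```

Of course this only shows where the NPE comes from, not which of the two attributes failed to resolve in this particular config.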

On Tue, Sep 19, 2017 at 7:22 AM, Mannott, Birgit 
wrote:

> Hi,
>
> I have a similar problem. I try to change the timezone for the
> last_index_time by setting
>
> <propertyWriter dateFormat="..." type="SimplePropertiesWriter" locale="en_US" />
>
> in the <dataConfig> section of my data-config.xml file.
>
> But when doing this I always get a NullPointerException on Delta Import:
>
> 2017-09-15 14:04:00.825 INFO  (Thread-2938) [   x:mex_prd_dev1100-ap]
> o.a.s.h.d.DataImporter Starting Delta Import
> 2017-09-15 14:04:00.827 ERROR (Thread-2938) [   x:mex_prd_dev1100-ap]
> o.a.s.h.d.DataImporter Delta Import Failed
> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
> PropertyWriter implementation:SimplePropertiesWriter
> at org.apache.solr.handler.dataimport.DataImporter.
> createPropertyWriter(DataImporter.java:330)
> at org.apache.solr.handler.dataimport.DataImporter.
> doDeltaImport(DataImporter.java:439)
> at org.apache.solr.handler.dataimport.DataImporter.
> runCmd(DataImporter.java:476)
> at org.apache.solr.handler.dataimport.DataImporter.
> lambda$runAsync$0(DataImporter.java:457)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:598)
> at org.apache.solr.handler.dataimport.
> SimplePropertiesWriter.init(SimplePropertiesWriter.java:100)
> at org.apache.solr.handler.dataimport.DataImporter.
> createPropertyWriter(DataImporter.java:328)
> ... 4 more
>
> Has anyone an idea what is wrong or missing?
>
> Thanks,
> Birgit
>
>
>
> -Original Message-
> From: Jamie Jackson [mailto:jamieja...@gmail.com]
> Sent: Tuesday, September 19, 2017 3:42 AM
> To: solr-user@lucene.apache.org
> Subject: [bulk]: Dates and DataImportHandler
>
> Hi folks,
>
> My DB server is on America/Chicago time. Solr (on Docker) is running on
> UTC. Dates coming from my (MariaDB) data source seem to get translated
> properly into the Solr index without me doing anything special.
>
> However when doing delta imports using last_index_time (
> http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport ), I
> can't seem to get the date, which Solr provides, to be understood by the DB
> as being UTC (and translated back, accordingly). In other words, the DB
> thinks the Solr UTC date is local, so it thinks the date is ahead by six
> hours.
>
> '${dataimporter.request.clean}' != 'false'
>
> or dt > '${dataimporter.last_index_time}'
>
> I came up with this workaround, which seems to work:
>
> '${dataimporter.request.clean}' != 'false'
>
> /* ${user.timezone} is UTC, and the ${custom.dataimporter.datasource.tz}
> property is set to America/Chicago */
>
> or dt > CONVERT_TZ('${dataimporter.last_index_time}','${user.
> timezone}','${
> custom.dataimporter.datasource.tz}')
>
> However, isn't there a way for this translation to happen more naturally?
>
> I thought maybe I could do something like this:
>
> <propertyWriter dateFormat="yyyy-MM-dd HH:mm:ssZ" type="SimplePropertiesWriter" />
>
> The above did set the property as expected (with a trailing `+`), but
> that didn't seem to help the DB understand/translate the date.
>
> Thanks,
> Jamie
>
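For reference, the element being discussed -- dateFormat kept, locale omitted -- would be declared roughly like this; the dateFormat value is an assumption matching SimplePropertiesWriter's default rather than a value quoted in the thread:

```xml
<!-- sketch of a propertyWriter declaration inside data-config.xml;
     dateFormat value assumed, not taken from the thread -->
<propertyWriter type="SimplePropertiesWriter"
                dateFormat="yyyy-MM-dd HH:mm:ss" />
```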


Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-19 Thread shamik
Emir, after digging deeper into the logs (using New Relic and the Solr admin) during
the outage, it looks like a combination of query load and the indexing process
triggered it. Based on the earlier pattern, memory would tend to increase at
a steady pace, but then surge all of a sudden, triggering OOM. After I
scaled down the heap size per Walter's suggestion, the memory seems to be
holding up. But there's a possibility the lower heap size is forcing
the GC to use more CPU. The cache size has been scaled
down; I'm hoping it's no longer adding overhead after every commit.

I have facet.limit=-1 configured for a few search types, but facet.mincount is
always set to 1. I didn't know that's detrimental to docValues.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Using SOLR J 5.5.4 with SOLR 6.5

2017-09-19 Thread Walter Underwood
We run SolrJ 4.7.1 with Solr 6.5.1 (16 node cloud). No problems.

We do not use the cloud-specific client and I’m pretty sure that we don’t use 
ConcurrentUpdateSolrServer. The latter is because it doesn’t report errors 
properly.

We do our indexing through the load balancer and let the Solr Cloud cluster get 
the right docs to the right shards. That runs at 1 million docs/minute, so it 
isn’t worth doing anything fancier.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Sep 19, 2017, at 9:05 AM, David Hastings  
> wrote:
> 
> What about the ConcurrentUpdateSolrServer for SolrJ?  That is what almost
> all of my indexing code is using for Solr 5.x. It's been a while since I
> experimented with upgrading, but I seem to remember having to go
> to HttpSolrClient and couldn't get the code to compile, so I tabled the
> experiment for a while.  Eventually I will need to move to Solr 6, but if I
> could keep the same indexing code, that would be ideal.
> 
> On Tue, Sep 19, 2017 at 11:59 AM, Erick Erickson 
> wrote:
> 
>> Felix:
>> 
>> There's no specific testing that I know of for this issue, it's "best
>> effort". Which means it _should_ work but I can't make promises.
>> 
>> Now that said, underlying it all is just HTTP requests going back and
>> forth so I know of no a-priori reasons it wouldn't be fine. It's just
>> "try it and see" though.
>> 
>> Best,
>> Erick
>> 
>> I'm probably preaching to the choir, but Java 1.7 is two years past
>> the end of support from Oracle, somebody sometime has to deal with
>> upgrading.
>> 
>> On Mon, Sep 18, 2017 at 10:47 PM, Felix Stanley
>>  wrote:
>>> Hi there,
>>> 
>>> 
>>> 
>>> We are planning to use SOLR J 5.5.4 to query from SOLR 6.5.
>>> 
>>> The reason was that we have to rely on JDK 1.7 at the client and as far
>> as I
>>> know SOLR J 6.x.x only support JDK 1.8.
>>> 
>>> I understood that SOLR J generally maintains backwards/forward
>> compatibility
>>> from this article:
>>> 
>>> 
>>> 
>>> https://wiki.apache.org/solr/Solrj
>>> 
>>> 
>>> 
>>> Would there though be any exception that we need to take caution of for
>> this
>>> specific version?
>>> 
>>> 
>>> 
>>> Thanks a lot.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Best Regards,
>>> 
>>> 
>>> 
>>> Felix Stanley
>>> 
>>> 
>>> 
>>> 
>>> --
>> 



TermVectors and ExactStatsCache

2017-09-19 Thread Patrick Plante
Hi!

I have a SolrCloud 6.6 collection with a 3-shard setup where I need the 
TermVectors TF and DF values for queries.

I have configured the ExactStatsCache in the solrConfig:

<statsCache class="org.apache.solr.search.stats.ExactStatsCache"/>



When I query "detector works", it returns different docfreq values based on the 
shard the document comes from:

"termVectors":[
"27504103",[
  "uniqueKey","27504103",
  "kc",[
"detector works",[
  "tf",1,
  "df",3,
  "tf-idf",0.]]],
"27507925",[
  "uniqueKey","27507925",
  "kc",[
"detector works",[
  "tf",1,
  "df",3,
  "tf-idf",0.]]],
"27504105",[
  "uniqueKey","27504105",
  "kc",[
"detector works",[
  "tf",1,
  "df",2,
  "tf-idf",0.5]]],
"27507927",[
  "uniqueKey","27507927",
  "kc",[
"detector works",[
  "tf",1,
  "df",2,
  "tf-idf",0.5]]],
"27507929",[
  "uniqueKey","27507929",
  "kc",[
"detector works",[
  "tf",1,
  "df",1,
  "tf-idf",1.0]]],
"27504107",[
  "uniqueKey","27504107",
  "kc",[
"detector works",[
  "tf",1,
  "df",3,
  "tf-idf",0.}

I expect the DF values to be 6 and the TF-IDF to be adjusted on that value. 
I can see in the debug logs that the cache was active.

I have found a pending bug (since Solr 5.5: 
https://issues.apache.org/jira/browse/SOLR-8893) that explains that this 
ExactStatsCache is used to compute the correct TF-IDF for the query but not for 
the TermVectors component.

Is there any way to get the correctly merged DF values (and TF-IDF) from 
multiple shards?

Is there a way to tell which shard a document comes from, so I could compute 
my own correct DF?

Thank you,
Patrick
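Until SOLR-8893 is fixed, one client-side workaround is to merge the per-shard df values and recompute the score. A minimal sketch of the arithmetic, assuming (as the output above suggests) that the component's tf-idf behaves as tf * 1/df:

```python
# Per-shard document frequencies for "detector works", read off the
# termVectors output above (df=3, df=2 and df=1 depending on the shard).
shard_df = {"shard1": 3, "shard2": 2, "shard3": 1}
global_df = sum(shard_df.values())  # the expected collection-wide df

# In the output above tf-idf behaves as tf * (1/df)
# (df=2 -> 0.5, df=1 -> 1.0), so once the global df is known a merged
# value can be recomputed client-side:
def merged_tf_idf(tf: int, df: int) -> float:
    return tf * (1.0 / df)

merged = merged_tf_idf(1, global_df)
```

As for identifying the shard, requesting the [shard] document transformer (e.g. fl=*,[shard]) returns the shard each document came from, which would let a client group documents per shard and derive the global df itself.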



Re: Using SOLR J 5.5.4 with SOLR 6.5

2017-09-19 Thread David Hastings
What about the ConcurrentUpdateSolrServer for SolrJ?  That is what almost
all of my indexing code is using for Solr 5.x. It's been a while since I
experimented with upgrading, but I seem to remember having to go
to HttpSolrClient and couldn't get the code to compile, so I tabled the
experiment for a while.  Eventually I will need to move to Solr 6, but if I
could keep the same indexing code, that would be ideal.

On Tue, Sep 19, 2017 at 11:59 AM, Erick Erickson 
wrote:

> Felix:
>
> There's no specific testing that I know of for this issue, it's "best
> effort". Which means it _should_ work but I can't make promises.
>
> Now that said, underlying it all is just HTTP requests going back and
> forth so I know of no a-priori reasons it wouldn't be fine. It's just
> "try it and see" though.
>
> Best,
> Erick
>
> I'm probably preaching to the choir, but Java 1.7 is two years past
> the end of support from Oracle, somebody sometime has to deal with
> upgrading.
>
> On Mon, Sep 18, 2017 at 10:47 PM, Felix Stanley
>  wrote:
> > Hi there,
> >
> >
> >
> > We are planning to use SOLR J 5.5.4 to query from SOLR 6.5.
> >
> > The reason was that we have to rely on JDK 1.7 at the client and as far
> as I
> > know SOLR J 6.x.x only support JDK 1.8.
> >
> > I understood that SOLR J generally maintains backwards/forward
> compatibility
> > from this article:
> >
> >
> >
> > https://wiki.apache.org/solr/Solrj
> >
> >
> >
> > Would there though be any exception that we need to take caution of for
> this
> > specific version?
> >
> >
> >
> > Thanks a lot.
> >
> >
> >
> >
> >
> > Best Regards,
> >
> >
> >
> > Felix Stanley
> >
> >
> >
> >
> > --
>


no search results for specific search in solr 6.6.0

2017-09-19 Thread Sascha Tuschinski
Hello Community,

We are using a Solr Core with Solr 6.6.0 on Windows 10 (latest updates) with 
field names defined like "f_1179014266_txt". The number in the middle of the 
name differs for each field we use. For language specific fields we are adding 
an language specific extension e.g. "f_1179014267_txt_fr", 
"f_1179014268_txt_de", "f_1179014269_txt_en" and so on.
We are having the following odd issue within the French "_fr" field only:

Field: f_1197829835_txt_fr
Dynamic Field: *_txt_fr
Type: text_fr

  *   The saved value which had been added with no problem to the Solr index is 
"FRaoo".
  *   When searching within the Solr query tool for 
"f_1197829839_txt_fr:*FRao*" it returns the items matching the term as seen 
below - OK.
{
  "responseHeader":{
"status":0,
"QTime":1,
"params":{
  "q":"f_1197829839_txt_fr:*FRao*",
  "indent":"on",
  "wt":"json",
  "_":"1505808887827"}},
  "response":{"numFound":1,"start":0,"docs":[
  {
"id":"129",
"f_1197829834_txt_en":"EnAir",
"f_1197829822_txt_de":"Lufti",
"f_1197829835_txt_fr":"FRaoi",
"f_1197829836_txt_it":"ITAir",
"f_1197829799_txt":["Lufti"],
"f_1197829838_txt_en":"EnAir",
"f_1197829839_txt_fr":"FRaoo",
"f_1197829840_txt_it":"ITAir",
"_version_":1578520424165146624}]
  }}

  *   When searching for "f_1197829839_txt_fr:*FRaoo*" NO item is found - Wrong!
{
  "responseHeader":{
"status":0,
"QTime":1,
"params":{
  "q":"f_1197829839_txt_fr:*FRaoo*",
  "indent":"on",
  "wt":"json",
  "_":"1505808887827"}},
  "response":{"numFound":0,"start":0,"docs":[]
  }}
  *   When searching for "f_1197829839_txt_fr:FRaoo" (no wildcards) the matching
items are found - OK

{
  "responseHeader":{
"status":0,
"QTime":1,
"params":{
  "q":"f_1197829839_txt_fr:FRaoo",
  "indent":"on",
  "wt":"json",
  "_":"1505808887827"}},
  "response":{"numFound":1,"start":0,"docs":[
  {
"id":"129",
"f_1197829834_txt_en":"EnAir",
"f_1197829822_txt_de":"Lufti",
"f_1197829835_txt_fr":"FRaoi",
"f_1197829836_txt_it":"ITAir",
"f_1197829799_txt":["Lufti"],
"f_1197829838_txt_en":"EnAir",
"f_1197829839_txt_fr":"FRaoo",
"f_1197829840_txt_it":"ITAir",
"_version_":1578520424165146624}]
  }}
If we save exactly the same value into a different language field, e.g. one ending
in "_en" such as "f_1197829834_txt_en", then the search
"f_1197829834_txt_en:*FRaoo*" finds all items correctly!
We have no idea what's wrong here; we even recreated the index and can
reproduce this problem every time. I can only see that the value starts with
"FR" and the field extension ends with "fr", but this is no problem for "en",
"de" and so on. All fields are used in the same way and have the same field
properties.
Any help or ideas are highly appreciated. I filed a bug for this 
https://issues.apache.org/jira/browse/SOLR-11367 but had been asked to publish 
my question here. Thanks for reading.
Greetings,
___
Sascha Tuschinski
Manager Quality Assurance // Canto GmbH
Phone: +49 (0) 30 ­ 390 485 - 41
E-mail: stuschin...@canto.com
Web: canto.com

Canto GmbH
Lietzenburger Str. 46
10789 Berlin
Phone: +49 (0)30 390485-0
Fax: +49 (0)30 390485-55
Amtsgericht Berlin-Charlottenburg HRB 88566
Geschäftsführer: Jack McGannon, Thomas Mockenhaupt
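A hypothesis worth testing with the admin Analysis screen: wildcard terms bypass most of the field's analysis chain, so they are matched against the *indexed* tokens rather than the stored value. If the text_fr stemmer happens to shorten the indexed token for "FRaoo" (say, to "frao" -- an assumption, not verified here), every observation above follows: the non-wildcard query FRaoo still matches because query-time analysis stems it identically, *FRao* matches the shortened token, and *FRaoo* matches nothing. A toy model of that matching, with the stemmed token assumed:

```python
import fnmatch

# Hypothetical indexed token, assuming the French stemmer reduces
# "fraoo" to "frao"; the real token must be checked on Solr's
# Analysis screen for the text_fr field type.
indexed_token = "frao"

# Wildcard patterns are evaluated against indexed tokens, not raw text:
hits_frao = fnmatch.fnmatch(indexed_token, "*frao*")    # *FRao*  -> hit
hits_fraoo = fnmatch.fnmatch(indexed_token, "*fraoo*")  # *FRaoo* -> miss
```

Running "FRaoo" through the Analysis screen for both text_fr and the "_en" field type would confirm or rule this out.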



Re: Using SOLR J 5.5.4 with SOLR 6.5

2017-09-19 Thread Erick Erickson
Felix:

There's no specific testing that I know of for this issue, it's "best
effort". Which means it _should_ work but I can't make promises.

Now that said, underlying it all is just HTTP requests going back and
forth so I know of no a-priori reasons it wouldn't be fine. It's just
"try it and see" though.

Best,
Erick

I'm probably preaching to the choir, but Java 1.7 is two years past
the end of support from Oracle, somebody sometime has to deal with
upgrading.

On Mon, Sep 18, 2017 at 10:47 PM, Felix Stanley
 wrote:
> Hi there,
>
>
>
> We are planning to use SOLR J 5.5.4 to query from SOLR 6.5.
>
> The reason was that we have to rely on JDK 1.7 at the client and as far as I
> know SOLR J 6.x.x only support JDK 1.8.
>
> I understood that SOLR J generally maintains backwards/forward compatibility
> from this article:
>
>
>
> https://wiki.apache.org/solr/Solrj
>
>
>
> Would there though be any exception that we need to take caution of for this
> specific version?
>
>
>
> Thanks a lot.
>
>
>
>
>
> Best Regards,
>
>
>
> Felix Stanley
>
>
>
>
> --


Re: Using SOLR J 5.5.4 with SOLR 6.5

2017-09-19 Thread Shawn Heisey
On 9/18/2017 11:47 PM, Felix Stanley wrote:
> We are planning to use SOLR J 5.5.4 to query from SOLR 6.5.
>
> The reason was that we have to rely on JDK 1.7 at the client and as far as I
> know SOLR J 6.x.x only support JDK 1.8.
>
> I understood that SOLR J generally maintains backwards/forward compatibility
> from this article:

As long as you're not accessing a SolrCloud cluster with
CloudSolrClient, that should work well.  HttpSolrClient tends to work
well across a fairly wide version discrepancy -- the HTTP API changes
infrequently and maintains backward compatibility very well.

With the cloud client, such a large a version gap is more likely to run
into problems, because SolrCloud changes very quickly from release to
release, and the cloud client is *tightly* integrated with how the
server functions.

Thanks,
Shawn
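As a sketch of why the version gap is mostly benign for plain HTTP usage: a SolrJ query ultimately reduces to an HTTP request with standard parameters, and that wire format changes slowly. The host and collection below are hypothetical:

```python
from urllib.parse import urlencode

# What any SolrJ HttpSolrClient query boils down to on the wire:
# a GET against /solr/<collection>/select with encoded parameters.
base = "http://localhost:8983/solr/gettingstarted/select"  # hypothetical
params = {"q": "*:*", "wt": "json", "rows": "10"}
url = f"{base}?{urlencode(params)}"
# Any HTTP-capable client, old or new, can issue this request;
# the response serialization is driven by the wt parameter.
```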



RE: Zookeeper credentials are showed up on the Solr Admin GUI

2017-09-19 Thread Pekhov, Ivan (NIH/NLM/NCBI) [C]
Hi Susheel,

Thank you so much for so quick response! I've created the issue as you 
requested, please refer to the link:

https://issues.apache.org/jira/browse/SOLR-11369

Thank you!
Ivan

-Original Message-
From: Susheel Kumar [mailto:susheel2...@gmail.com] 
Sent: Tuesday, September 19, 2017 11:29 AM
To: solr-user@lucene.apache.org
Subject: Re: Zookeeper credentials are showed up on the Solr Admin GUI

Hi Ivan, Can you please submit a JIRA/bug report for this at 
https://issues.apache.org/jira/projects/SOLR

Thanks,
Susheel

On Tue, Sep 19, 2017 at 11:12 AM, Pekhov, Ivan (NIH/NLM/NCBI) [C] < 
ivan.pek...@nih.gov> wrote:

> Hello Guys,
>
> We've been noticing this problem with Solr version 5.4.1 and it's 
> still the case for the version 6.6.0. The problem is that we're using 
> SolrCloud with secured Zookeeper and our users are granted access to 
> Solr Admin GUI, and, at the same time, they are not supposed to have 
> access to Zookeeper credentials, i.e. usernames and passwords. 
> However, we (and some of our
> users) have found out that Zookeeper credentials are displayed on at 
> least two sections of the Solr Admin GUI, i.e. "Dashboard" and "Java 
> Properties".
>
> Having taken a look at the JavaScript code that runs behind the scenes 
> for those pages, we can see that the sensitive parameters ( 
> -DzkDigestPassword, -DzkDigestReadonlyPassword, 
> -DzkDigestReadonlyUsername, -DzkDigestUsername
> ) are fetched via AJAX from the following two URL paths:
>
> /solr/admin/info/system
> /solr/admin/info/properties
>
> Could you please consider for the future Solr releases removing the 
> Zookeeper parameters mentioned above from the output of these URLs and 
> from other URLs that contain this information in their output, if 
> there are any besides the ones mentioned? We find that it is pretty
> challenging (and probably impossible) to restrict users from accessing 
> some particular paths with security.json mechanism, and we think that 
> that would be beneficial for overall Solr security to hide Zookeeper 
> credentials.
>
> Thank you so much for your consideration!
>
> Best regards,
> Ivan Pekhov
>
>


Re: Zookeeper credentials are showed up on the Solr Admin GUI

2017-09-19 Thread Susheel Kumar
Hi Ivan, Can you please submit a JIRA/bug report for this at
https://issues.apache.org/jira/projects/SOLR

Thanks,
Susheel

On Tue, Sep 19, 2017 at 11:12 AM, Pekhov, Ivan (NIH/NLM/NCBI) [C] <
ivan.pek...@nih.gov> wrote:

> Hello Guys,
>
> We've been noticing this problem with Solr version 5.4.1 and it's still
> the case for the version 6.6.0. The problem is that we're using SolrCloud
> with secured Zookeeper and our users are granted access to Solr Admin GUI,
> and, at the same time, they are not supposed to have access to Zookeeper
> credentials, i.e. usernames and passwords. However, we (and some of our
> users) have found out that Zookeeper credentials are displayed on at least
> two sections of the Solr Admin GUI, i.e. "Dashboard" and "Java Properties".
>
> Having taken a look at the JavaScript code that runs behind the scenes for
> those pages, we can see that the sensitive parameters ( -DzkDigestPassword,
> -DzkDigestReadonlyPassword, -DzkDigestReadonlyUsername, -DzkDigestUsername
> ) are fetched via AJAX from the following two URL paths:
>
> /solr/admin/info/system
> /solr/admin/info/properties
>
> Could you please consider for the future Solr releases removing the
> Zookeeper parameters mentioned above from the output of these URLs and from
> other URLs that contain this information in their output, if there are any
> besides the ones mentioned? We find that it is pretty challenging (and
> probably impossible) to restrict users from accessing some particular paths
> with security.json mechanism, and we think that that would be beneficial
> for overall Solr security to hide Zookeeper credentials.
>
> Thank you so much for your consideration!
>
> Best regards,
> Ivan Pekhov
>
>


Zookeeper credentials are showed up on the Solr Admin GUI

2017-09-19 Thread Pekhov, Ivan (NIH/NLM/NCBI) [C]
Hello Guys,

We've been noticing this problem with Solr version 5.4.1 and it's still the 
case for the version 6.6.0. The problem is that we're using SolrCloud with 
secured Zookeeper and our users are granted access to Solr Admin GUI, and, at 
the same time, they are not supposed to have access to Zookeeper credentials, 
i.e. usernames and passwords. However, we (and some of our users) have found 
out that Zookeeper credentials are displayed on at least two sections of the 
Solr Admin GUI, i.e. "Dashboard" and "Java Properties".

Having taken a look at the JavaScript code that runs behind the scenes for 
those pages, we can see that the sensitive parameters ( -DzkDigestPassword, 
-DzkDigestReadonlyPassword, -DzkDigestReadonlyUsername, -DzkDigestUsername ) 
are fetched via AJAX from the following two URL paths:

/solr/admin/info/system
/solr/admin/info/properties

Could you please consider for the future Solr releases removing the Zookeeper 
parameters mentioned above from the output of these URLs and from other URLs 
that contain this information in their output, if there are any besides the 
ones mentioned? We find that it is pretty challenging (and probably 
impossible) to restrict users from accessing some particular paths with 
security.json mechanism, and we think that that would be beneficial for overall 
Solr security to hide Zookeeper credentials.

Thank you so much for your consideration!

Best regards,
Ivan Pekhov



Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-19 Thread Walter Underwood
With frequent commits, autowarming isn’t very useful. Even with a daily bulk 
update, I use explicit warming queries.

For our textbooks collection, I configure the twenty top queries and the twenty 
most common words in the index. Neither list changes much. If we used facets, 
I’d warm those, too.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Sep 19, 2017, at 12:18 AM, Toke Eskildsen  wrote:
> 
> On Mon, 2017-09-18 at 20:47 -0700, shamik wrote:
>> I did bring down the heap size to 8gb, changed to G1 and reduced the
>> cache params. The memory so far has been holding up but will wait for
>> a while before passing on a judgment. 
> 
> Sounds reasonable.
> 
>> > autowarmCount="0"/>
> [...]
> 
>> The change seemed to have increased the number of slow queries (1000
>> ms), but I'm willing to address the OOM over performance at this
>> point.
> 
> You over-compensated by switching from an enormous cache with excessive
> warming to a small cache with no warming. Try setting autowarmCount to
> 20 or something like that. Also make an explicit warming query that
> facets on all your facet-fields, to initialize the underlying
> structures.
> 
>> One thing I realized is that I provided the wrong index size here.
>> It's 49gb instead of 25, which I mistakenly picked from one shard.
> 
> Quite independent from all of this, your index is not a large one; it
> might work better for you to store it as a single shard (with
> replicas), to avoid the overhead of the distributes processing needed
> for multi-shard. The overhead is especially visible when doing a lot of
> String faceting.
> 
>> I hope the heap size will continue to sustain for the index size. 
> 
> You can check the memory usage in the admin GUI.
> 
> - Toke Eskildsen, Royal Danish Library
> 
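Toke's two suggestions (a modest autowarmCount plus an explicit warming query that facets on the facet fields) could be sketched in solrconfig.xml roughly like this; the cache sizes and the field name are placeholders, not values from this thread:

```xml
<!-- sketch: modest autowarming instead of none -->
<filterCache class="solr.FastLRUCache" size="512" initialSize="512"
             autowarmCount="20"/>

<!-- sketch: prime facet structures on each new searcher;
     "category" is a placeholder facet field -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="facet">true</str>
      <str name="facet.field">category</str>
    </lst>
  </arr>
</listener>
```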



Re: Installation help

2017-09-19 Thread john999
I worked on SolrCloud 3-4 years ago. I worked day and night, as at that time
I was not a Linux guy. After lots of nightmare reading, I installed SolrCloud
with an external ZooKeeper. Find my write-up here: "Solr Cloud
Installation with External Zookeeper"
  



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


RE: How to remove control characters in stored value at Solr side

2017-09-19 Thread Markus Jelsma
Ah, thanks!

 
 
-Original message-
> From:Chris Hostetter 
> Sent: Monday 18th September 2017 23:11
> To: solr-user@lucene.apache.org
> Subject: RE: How to remove control characters in stored value at Solr side
> 
> 
> : But, can you then explain why Apache Nutch with SolrJ had this problem? 
> : It seems that by default SolrJ does use XML as transport format. We have 
> : always used SolrJ which i assumed would default to javabin, but we had 
> : this exact problem anyway, and solved it by stripping non-character code 
> : points.
> : 
> : When we use SolrJ for querying we clearly see wt=javabin in the logs, 
> : but updates showed the problem. Can we fix it anywhere?
> 
> wt=javabin indicates what *response* format the client (ie: solrj) is 
> requesting from the server ... the format used for the *request* body is 
> determined by the client based on the Content-Type of the ContentStream 
> it sends to Solr.
> 
> When using SolrJ, and sending an arbitrary/abstract SolrRequest objects, 
> the "RequestWriter" configured on the SolrClient is what specifies the 
> Content-Type to use (and is in charge of serializing the java objects 
> appropriately)
> 
> BinaryRequestWriter (which uses javabin format to serialize SolrRequest 
> objects when building ContentStreams) has been the default since Solr 
> 5.5/6.0 (see SOLR-8595)
> 
> 
> -Hoss
> http://www.lucidworks.com/
> 


ClassCastException in RelevanceComparator

2017-09-19 Thread Dmitry Kan
Hi,

Solr: 4.10.2
Schema has two fields of TrieIntField type.








Simple match all query with a cursorMark produces the following exception:

2017-09-19 11:52:53.684 [qtp1722023916-7992] ERROR
org.apache.solr.servlet.SolrDispatchFilter  - null:java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Float

at
org.apache.lucene.search.FieldComparator$RelevanceComparator.setTopValue(FieldComparator.java:758)

at
org.apache.lucene.search.TopFieldCollector$PagingFieldCollector.<init>(TopFieldCollector.java:877)

at
org.apache.lucene.search.TopFieldCollector.create(TopFieldCollector.java:1183)

at
org.apache.solr.search.SolrIndexSearcher.buildTopDocsCollector(SolrIndexSearcher.java:1537)

at
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1617)

at
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1433)

at
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:514)

at
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:485)

at
com.alphasense.solr.search.handler.AsQueryComponent.process(AsQueryComponent.java:28)

at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)

at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)

at org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)

at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)

at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)

at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)

at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1486)

at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:503)

at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:138)

at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:564)

at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:213)

at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1094)

at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:432)

at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:175)

at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1028)

at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:136)

at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:258)

at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:109)

at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)

at
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:317)

at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)

at org.eclipse.jetty.server.Server.handle(Server.java:445)

at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:267)

at
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:224)

at
org.eclipse.jetty.io.AbstractConnection$ReadCallback.run(AbstractConnection.java:358)

at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:601)

at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:532)

at java.lang.Thread.run(Thread.java:745)


Would tint fields be causing this? If so, should they be defined as Floats?

Thanks,

Dmitry

-- 
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: https://semanticanalyzer.info
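One hedged reading of the trace: a cursorMark token encodes the "after" sort values for the exact sort it was generated with, and RelevanceComparator.setTopValue expects a Float score. If the decoded cursor value at that sort position is instead a Long (for example, coming from a Trie int/long field sort, or from a cursor produced under a different sort than the follow-up request uses), the cast fails. A toy model of that type clash (names hypothetical; the real logic lives in Lucene's FieldComparator):

```python
# Toy stand-in for RelevanceComparator.setTopValue, which requires a
# float score as its "after" value.
def relevance_set_top_value(after_value):
    if not isinstance(after_value, float):
        raise TypeError("java.lang.Long cannot be cast to java.lang.Float")
    return after_value

# A long decoded from a cursorMark built over an int/long field sort
# (assumed scenario, for illustration only):
cursor_after = 1505808887827

try:
    relevance_set_top_value(cursor_after)
    outcome = "ok"
except TypeError:
    outcome = "class cast"   # the failure mode seen in the trace
```

Verifying that every request in the cursor sequence uses the identical sort clause would be the first thing to check.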


Re: Search by similarity?

2017-09-19 Thread alessandro.benedetti
In addition to that, I still believe More Like This is a better option for
you.
The reason is that the MLT is able to evaluate the interesting terms from
your document (title is the only field of interest for you), and boost them
accordingly.

Regarding your "80% of similarity", this is trickier.
You can potentially calculate the score of the identical document and then
render the score of the similar ones normalised based on that.

Normally it's useless to show the score value per se, but in the case of MLT
it actually makes sense to give a percentage score result.
Indeed it could be a good addition to the MLT.

Regards





-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


RE: [bulk]: Dates and DataImportHandler

2017-09-19 Thread Mannott, Birgit
Hi,

I have a similar problem. I try to change the timezone for the last_index_time 
by setting 
<propertyWriter type="SimplePropertiesWriter" locale="en_US" />



in the <dataConfig> section of my data-config.xml file.

But when doing this I always get a NullPointerException on Delta Import:

2017-09-15 14:04:00.825 INFO  (Thread-2938) [   x:mex_prd_dev1100-ap] 
o.a.s.h.d.DataImporter Starting Delta Import
2017-09-15 14:04:00.827 ERROR (Thread-2938) [   x:mex_prd_dev1100-ap] 
o.a.s.h.d.DataImporter Delta Import Failed
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to 
PropertyWriter implementation:SimplePropertiesWriter
at 
org.apache.solr.handler.dataimport.DataImporter.createPropertyWriter(DataImporter.java:330)
at 
org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:439)
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:476)
at 
org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:457)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:598)
at 
org.apache.solr.handler.dataimport.SimplePropertiesWriter.init(SimplePropertiesWriter.java:100)
at 
org.apache.solr.handler.dataimport.DataImporter.createPropertyWriter(DataImporter.java:328)
... 4 more

Has anyone an idea what is wrong or missing?

Thanks,
Birgit



-Original Message-
From: Jamie Jackson [mailto:jamieja...@gmail.com] 
Sent: Tuesday, September 19, 2017 3:42 AM
To: solr-user@lucene.apache.org
Subject: [bulk]: Dates and DataImportHandler

Hi folks,

My DB server is on America/Chicago time. Solr (on Docker) is running on UTC. 
Dates coming from my (MariaDB) data source seem to get translated properly into 
the Solr index without me doing anything special.

However when doing delta imports using last_index_time ( 
http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport ), I can't 
seem to get the date, which Solr provides, to be understood by the DB as being 
UTC (and translated back, accordingly). In other words, the DB thinks the Solr 
UTC date is local, so it thinks the date is ahead by six hours.

'${dataimporter.request.clean}' != 'false'

or dt > '${dataimporter.last_index_time}'
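For context, in the delta-via-full-import approach described on that wiki
page, the two clauses above sit in the entity's main query, roughly like
this (the entity, table, and column names here are placeholders):

```xml
<!-- Sketch of a full-import-used-as-delta entity; "doc" and "dt" are
     placeholder names. A clean import re-indexes everything; otherwise
     only rows with dt newer than the last index time are selected. -->
<entity name="doc" pk="id"
        query="SELECT * FROM doc
               WHERE '${dataimporter.request.clean}' != 'false'
                  OR dt &gt; '${dataimporter.last_index_time}'">
</entity>
```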

I came up with this workaround, which seems to work:

'${dataimporter.request.clean}' != 'false'

/* ${user.timezone} is UTC, and the ${custom.dataimporter.datasource.tz}
property is set to America/Chicago */

or dt > CONVERT_TZ('${dataimporter.last_index_time}','${user.timezone}','${custom.dataimporter.datasource.tz}')

However, isn't there a way for this translation to happen more naturally?

I thought maybe I could do something like this:



The above did set the property as expected (with a trailing `+`), but that
didn't seem to help the DB understand/translate the date.

Thanks,
Jamie


Re: Knn classifier doesn't work

2017-09-19 Thread Tommaso Teofili
hi Alessandro,

yes please, feel free to open a Jira issue, patches welcome!

Tommaso

On Mon, Sep 18, 2017 at 14:30 alessandro.benedetti <
a.benede...@sease.io> wrote:

> Hi Tommaso,
> you are definitely right!
> I see that the method : MultiFields.getTerms
> returns :
>  if (termsPerLeaf.size() == 0) {
>   return null;
> }
>
> As you correctly mentioned this is not handled in :
>
>
> org/apache/lucene/classification/document/SimpleNaiveBayesDocumentClassifier.java:115
>
> org/apache/lucene/classification/document/SimpleNaiveBayesDocumentClassifier.java:228
> org/apache/lucene/classification/SimpleNaiveBayesClassifier.java:243
>
> Can you do the change or should I open a Jira issue and attach the simple
> patch for you to commit?
> let me know,
>
> Regards
>
>
>
> -----
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-19 Thread Emir Arnautović
Hi Shamik,
Can you tell us a bit more about how you use Solr before it OOMs? Do you
observe some heavy indexing, or does it happen during higher query load? Does
memory increase slowly or jump suddenly? Do you have any monitoring tool to see
if you can correlate some metric with the memory increase?
You mentioned that you have doc values on fields used for faceting, but that
will not save you if you facet on high-cardinality fields with
facet.limit=-1&facet.mincount=0 or something similar.

In the worst case, you can take a heap dump and see what’s in it.

Regards,
Emir

> On 19 Sep 2017, at 10:11, shamik  wrote:
> 
> Thanks, the change seemed to have addressed the memory issue (so far), but on
> the contrary, the GC choked the CPUs, stalling everything. The CPU
> utilization across the cluster clocked close to 400%, literally stalling
> everything. At first glance, the G1-Old generation looks to be the culprit
> that took up 80% of the CPU. Not sure what really triggered it, as the GC
> seemed to have been stable till then. The other thing I noticed was that the
> mlt queries (I'm using the mlt query parser for cloud support) took a huge
> amount of time to respond (10 sec+) during the CPU spike compared to the
> rest. Again, that might just be due to the CPU.
> 
> The index might not be large enough to merit a couple of shards, but it has
> never been an issue for the past couple of years on 5.5. We never had a
> single outage related to memory or CPU. The query/indexing load has
> increased over time, but it has been linear. I'm a little baffled why 6.6
> would behave so differently. Perhaps the hardware is not adequate enough?
> I'm running on an 8 core / 30gb machine with SSD.
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html



Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-19 Thread shamik
Thanks, the change seemed to have addressed the memory issue (so far), but on
the contrary, the GC choked the CPUs, stalling everything. The CPU
utilization across the cluster clocked close to 400%, literally stalling
everything. At first glance, the G1-Old generation looks to be the culprit
that took up 80% of the CPU. Not sure what really triggered it, as the GC
seemed to have been stable till then. The other thing I noticed was that the
mlt queries (I'm using the mlt query parser for cloud support) took a huge
amount of time to respond (10 sec+) during the CPU spike compared to the
rest. Again, that might just be due to the CPU.

The index might not be large enough to merit a couple of shards, but it has
never been an issue for the past couple of years on 5.5. We never had a
single outage related to memory or CPU. The query/indexing load has
increased over time, but it has been linear. I'm a little baffled why 6.6
would behave so differently. Perhaps the hardware is not adequate enough?
I'm running on an 8 core / 30gb machine with SSD.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-19 Thread Toke Eskildsen
On Mon, 2017-09-18 at 20:47 -0700, shamik wrote:
> I did bring down the heap size to 8gb, changed to G1 and reduced the
> cache params. The memory so far has been holding up but will wait for
> a while before passing on a judgment. 

Sounds reasonable.

>  autowarmCount="0"/>
[...]

> The change seemed to have increased the number of slow queries (1000
> ms), but I'm willing to address the OOM over performance at this
> point.

You over-compensated by switching from an enormous cache with excessive
warming to a small cache with no warming. Try setting autowarmCount to
20 or something like that. Also make an explicit warming query that
facets on all your facet-fields, to initialize the underlying
structures.
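
As a concrete sketch of what that could look like in solrconfig.xml (the
field names cat and manu are placeholders for your actual facet fields, and
the cache sizes are illustrative only):

```xml
<!-- A modest autowarm count, rather than the extremes of a huge warmed
     cache or a cold one. -->
<filterCache class="solr.FastLRUCache" size="512" initialSize="512"
             autowarmCount="20"/>

<!-- An explicit warming query fired on each new searcher, faceting on
     the fields the application facets on. -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="facet">true</str>
      <str name="facet.field">cat</str>
      <str name="facet.field">manu</str>
    </lst>
  </arr>
</listener>
```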

>  One thing I realized is that I provided the wrong index size here.
> It's 49gb instead of 25, which I mistakenly picked from one shard.

Quite independent from all of this, your index is not a large one; it
might work better for you to store it as a single shard (with
replicas), to avoid the overhead of the distributed processing needed
for multi-shard. The overhead is especially visible when doing a lot of
String faceting.

>  I hope the heap size will continue to sustain for the index size. 

You can check the memory usage in the admin GUI.

- Toke Eskildsen, Royal Danish Library