date:20160406

Shawn, thank you. This was exactly what I was looking for. 

I am already using SolrJ, so the follow two lines did the job:

ZkConfigManager configManager = new 
ZkConfigManager(cloudSolrClient.getZkStateReader().getZkClient());
configManager.uploadConfigDir(Paths.get(configPath), configName);

Thanks


Bosco





On 4/6/16, 5:02 PM, "Shawn Heisey"  wrote:

>On 4/6/2016 3:26 PM, Don Bosco Durai wrote:
>> I want to automate the entire process from my Java process which is not 
>> running on any of the servers were SolrCloud is running. In short, I don’t 
>> have access to bin/solr or server/scripts/cloud-scripts, etc from my 
>> application. So I was wondering if there were any way, like uploading a zip 
>> with the configs (schema.xml, solrconfig.xml, etc.). One workaround I can 
>> thinking is of making direct zookeeper calls.
>
>If you're using a dependency management system like maven or ivy, you
>could probably request solr-core, which would let you use ZkCLI or
>ZkConfigManager directly in your own code.  That would be a VERY
>significant increase in the size of your app, in the form of dependent
>jars.  Although this might work, it's a sledgehammer approach.  There
>are 72 direct dependencies for 5.5.0, and some of those have further
>dependencies.  Some of solr-core's dependencies are quite large.
>
>http://mvnrepository.com/artifact/org.apache.solr/solr-core
>
>You could look at the code for ZkConfigManager and the classes it uses,
>see how they use zookeeper to send configs, and directly implement the
>zookeeper calls required.  Solr implements a wrapper around the
>zookeeper client called SolrZkClient, a wrapping that you might want to
>strip away, so you don't need the solr-core jar and its dependencies. 
>This approach requires the most work.
>
>The way the class inheritance is arranged is terrible for user code that
>wants to do SolrCloud config manipulation.  I'll see if I can come up
>with something to fix that, but it's not going to happen immediately.
>
>An option that approaches the problem from another direction: Copy
>WEB-INF/lib and other things (like log4j.properties and the logging jars
>in server/lib/ext) to somewhere on your client system and run ZkCLI
>directly as an external process from your own code, just like the zkcli
>script does ... or possibly even using a modified zkcli script.  This is
>not as clean as a code-based solution, but it would be relatively easy
>to implement.
>
>Thanks,
>Shawn
>

Re: Adding configset in SolrCloud via API

On 4/6/2016 3:26 PM, Don Bosco Durai wrote:
> I want to automate the entire process from my Java process which is not 
> running on any of the servers were SolrCloud is running. In short, I don’t 
> have access to bin/solr or server/scripts/cloud-scripts, etc from my 
> application. So I was wondering if there were any way, like uploading a zip 
> with the configs (schema.xml, solrconfig.xml, etc.). One workaround I can 
> thinking is of making direct zookeeper calls.

If you're using a dependency management system like maven or ivy, you
could probably request solr-core, which would let you use ZkCLI or
ZkConfigManager directly in your own code.  That would be a VERY
significant increase in the size of your app, in the form of dependent
jars.  Although this might work, it's a sledgehammer approach.  There
are 72 direct dependencies for 5.5.0, and some of those have further
dependencies.  Some of solr-core's dependencies are quite large.

http://mvnrepository.com/artifact/org.apache.solr/solr-core

You could look at the code for ZkConfigManager and the classes it uses,
see how they use zookeeper to send configs, and directly implement the
zookeeper calls required.  Solr implements a wrapper around the
zookeeper client called SolrZkClient, a wrapping that you might want to
strip away, so you don't need the solr-core jar and its dependencies. 
This approach requires the most work.

The way the class inheritance is arranged is terrible for user code that
wants to do SolrCloud config manipulation.  I'll see if I can come up
with something to fix that, but it's not going to happen immediately.

An option that approaches the problem from another direction: Copy
WEB-INF/lib and other things (like log4j.properties and the logging jars
in server/lib/ext) to somewhere on your client system and run ZkCLI
directly as an external process from your own code, just like the zkcli
script does ... or possibly even using a modified zkcli script.  This is
not as clean as a code-based solution, but it would be relatively easy
to implement.

Thanks,
Shawn

Re: Adding configset in SolrCloud via API

Hmmm...Not sure I understand, but it sounds like you've found the best
solution for the limitations you're experiencing...

On Wed, Apr 6, 2016 at 4:38 PM, Don Bosco Durai  wrote:

> My challenge is, the server where my application is running doesn’t have
> Solr bits installed.
>
>
>
>
> Right now I am asking users to install (just unzip) solr on any server and
> I give them a shell script to run the script from command line before
> starting my application. It is inconvenient, so I was seeing if anyone was
> able to automate it.
>
> Thanks
>
> Bosco
>
>
>
>
> On 4/6/16, 2:47 PM, "John Bickerstaff"  wrote:
>
> >Therefore, this becomes possible:
> >
> http://stackoverflow.com/questions/525212/how-to-run-unix-shell-script-from-java-code
> >
> >Hackish, but certainly doable...  Given there's no API...
> >
> >On Wed, Apr 6, 2016 at 3:44 PM, John Bickerstaff <
> j...@johnbickerstaff.com>
> >wrote:
> >
> >> Yup - just tested - that command runs fine with Solr NOT running...
> >>
> >> On Wed, Apr 6, 2016 at 3:41 PM, John Bickerstaff <
> j...@johnbickerstaff.com
> >> > wrote:
> >>
> >>> If you can get to the IP addresses from your application, then there's
> >>> probably a way...  Do you mean you're firewalled off or in some other
> way
> >>> unable to access the Solr box IP's from your Java application?
> >>>
> >>> If you're looking to do "automated build of virtual machines" there are
> >>> some tools like Vagrant...
> >>>
> >>> https://en.wikipedia.org/wiki/Vagrant_(software)
> >>>
> >>> Also, you probably don't need to be directly on one of the 'official"
> >>> SOLR machines to load the configs.  I haven't tested, but as long as
> you
> >>> have the configs and a VM or server running SOLR, you could do this
> >>> yourself.
> >>>
> >>> The following command (as far as I've been able to tell) ONLY creates
> the
> >>> conf "directory" in Zookeeper... nothing else...
> >>>
> >>> And, as a matter of fact, I'm almost positive SOLR did not need to be
> >>> running when I did this, but I did so many variations while trying to
> >>> figure out how to bring up a new 5.4 collection that I'm not positive
> - I
> >>> could be totally wrong on that one.
> >>>
> >>> My point is that I uploaded the configs from a machine that had no
> >>> collection yet...  it worked fine to set up then configs in Zookeeper.
> >>> Later, I issued the create collection command and referenced the
> config in
> >>> Zookeeper with the -n(?) flag...
> >>>
> >>> sudo /opt/solr/server/scripts/cloud-scripts/zkcli.sh -cmd upconfig
> >>> -confdir /home/john/conf/ -confname statdx -z 192.168.56.5/solr5_4
> >>>
> >>> On Wed, Apr 6, 2016 at 3:26 PM, Don Bosco Durai 
> wrote:
> >>>
>  I have SolrCloud pre-installed. I need to create a collection, but
>  before that I need to load the config into zookeeper.
> 
>  I want to automate the entire process from my Java process which is
> not
>  running on any of the servers were SolrCloud is running. In short, I
> don’t
>  have access to bin/solr or server/scripts/cloud-scripts, etc from my
>  application. So I was wondering if there were any way, like uploading
> a zip
>  with the configs (schema.xml, solrconfig.xml, etc.). One workaround I
> can
>  thinking is of making direct zookeeper calls.
> 
>  Anshum, thanks for your reply. I will see if I can find the JIRA.
> 
>  Thanks
> 
> 
>  Bosco
> 
> 
> 
> 
>  On 4/6/16, 2:17 PM, "Erick Erickson"  wrote:
> 
>  >As of Solr 5.5 the bin/solr script can do this, see:
>  >
> 
> https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference
>  >
>  >It's still not quite what you're looking for, but uploading arbitrary
>  >xml scripts through a browser is a security issue, so it's possible
>  >there will never be an API call to do that.
>  >
>  >Best
>  >Erick
>  >
>  >On Wed, Apr 6, 2016 at 1:52 PM, Anshum Gupta  >
>  wrote:
>  >> As of now, there's no way to do so. There were some efforts on
> those
>  lines but it's been on hold.
>  >>
>  >> -Anshum
>  >>
>  >>> On Apr 6, 2016, at 12:21 PM, Don Bosco Durai 
>  wrote:
>  >>>
>  >>> Is there an equivalent of server/scripts/cloud-scripts/zkcli.sh
>  -zkhost $zk_host -cmd upconfig -confdir $config_folder -confname
>  $config_name using APIs?
>  >>>
>  >>> I want to bootstrap by uploading the configs via API. Once the
>  configs are uploaded, I am now able to do everything else via API.
>  >>>
>  >>> Thanks
>  >>>
>  >>> Bosco
>  >>>
>  >>>
> 
> 
> >>>
> >>
>
>

Re: Adding configset in SolrCloud via API

My challenge is, the server where my application is running doesn’t have Solr 
bits installed. 




Right now I am asking users to install (just unzip) solr on any server and I 
give them a shell script to run the script from command line before starting my 
application. It is inconvenient, so I was seeing if anyone was able to automate 
it.

Thanks

Bosco




On 4/6/16, 2:47 PM, "John Bickerstaff"  wrote:

>Therefore, this becomes possible:
>http://stackoverflow.com/questions/525212/how-to-run-unix-shell-script-from-java-code
>
>Hackish, but certainly doable...  Given there's no API...
>
>On Wed, Apr 6, 2016 at 3:44 PM, John Bickerstaff 
>wrote:
>
>> Yup - just tested - that command runs fine with Solr NOT running...
>>
>> On Wed, Apr 6, 2016 at 3:41 PM, John Bickerstaff > > wrote:
>>
>>> If you can get to the IP addresses from your application, then there's
>>> probably a way...  Do you mean you're firewalled off or in some other way
>>> unable to access the Solr box IP's from your Java application?
>>>
>>> If you're looking to do "automated build of virtual machines" there are
>>> some tools like Vagrant...
>>>
>>> https://en.wikipedia.org/wiki/Vagrant_(software)
>>>
>>> Also, you probably don't need to be directly on one of the 'official"
>>> SOLR machines to load the configs.  I haven't tested, but as long as you
>>> have the configs and a VM or server running SOLR, you could do this
>>> yourself.
>>>
>>> The following command (as far as I've been able to tell) ONLY creates the
>>> conf "directory" in Zookeeper... nothing else...
>>>
>>> And, as a matter of fact, I'm almost positive SOLR did not need to be
>>> running when I did this, but I did so many variations while trying to
>>> figure out how to bring up a new 5.4 collection that I'm not positive - I
>>> could be totally wrong on that one.
>>>
>>> My point is that I uploaded the configs from a machine that had no
>>> collection yet...  it worked fine to set up then configs in Zookeeper.
>>> Later, I issued the create collection command and referenced the config in
>>> Zookeeper with the -n(?) flag...
>>>
>>> sudo /opt/solr/server/scripts/cloud-scripts/zkcli.sh -cmd upconfig
>>> -confdir /home/john/conf/ -confname statdx -z 192.168.56.5/solr5_4
>>>
>>> On Wed, Apr 6, 2016 at 3:26 PM, Don Bosco Durai  wrote:
>>>
 I have SolrCloud pre-installed. I need to create a collection, but
 before that I need to load the config into zookeeper.

 I want to automate the entire process from my Java process which is not
 running on any of the servers were SolrCloud is running. In short, I don’t
 have access to bin/solr or server/scripts/cloud-scripts, etc from my
 application. So I was wondering if there were any way, like uploading a zip
 with the configs (schema.xml, solrconfig.xml, etc.). One workaround I can
 thinking is of making direct zookeeper calls.

 Anshum, thanks for your reply. I will see if I can find the JIRA.

 Thanks


 Bosco




 On 4/6/16, 2:17 PM, "Erick Erickson"  wrote:

 >As of Solr 5.5 the bin/solr script can do this, see:
 >
 https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference
 >
 >It's still not quite what you're looking for, but uploading arbitrary
 >xml scripts through a browser is a security issue, so it's possible
 >there will never be an API call to do that.
 >
 >Best
 >Erick
 >
 >On Wed, Apr 6, 2016 at 1:52 PM, Anshum Gupta 
 wrote:
 >> As of now, there's no way to do so. There were some efforts on those
 lines but it's been on hold.
 >>
 >> -Anshum
 >>
 >>> On Apr 6, 2016, at 12:21 PM, Don Bosco Durai 
 wrote:
 >>>
 >>> Is there an equivalent of server/scripts/cloud-scripts/zkcli.sh
 -zkhost $zk_host -cmd upconfig -confdir $config_folder -confname
 $config_name using APIs?
 >>>
 >>> I want to bootstrap by uploading the configs via API. Once the
 configs are uploaded, I am now able to do everything else via API.
 >>>
 >>> Thanks
 >>>
 >>> Bosco
 >>>
 >>>


>>>
>>

Re: using solr AnalyticsQuery API vs facet API

2016-04-06 Thread sudsport s

Adding Yonik,

I almost implemented custom aggregate function using new facet API but
later on got runtime exceptions as "FacetContext" is not public. so looks
like Facet api components can't be created as external plugins.

I am successful using AnalyticsQueryAPI to perform what I want.

Yonik can you help me figure about differences between Solr analytics API
vs JSON Facet API?




On Fri, Mar 18, 2016 at 8:52 AM, sudsport s  wrote:

> Thanks Joel for responding.
> but I am still not sure when to use  Solr analytics API i vs  JSON facet API
> (What is difference between ValueSource vs PostFilter)
>
> I know that ValueSource is useful to implement functions.
>
> On Wed, Mar 16, 2016 at 9:49 AM, sudsport s  wrote:
>
>> Hi ,
>>
>> I am planning to write custom aggregator in solr which will use some
>> probabilistic data structures per shard to accumate results and then after
>> shard merging results will be sent to user as integer.
>>
>> I explored 2 options to do this
>>
>> 1. Solr analytics API
>> https://cwiki.apache.org/confluence/display/solr/AnalyticsQuery+API
>>
>> I can implement merge policy and post filter to perform aggregation , I
>> have example working using this , but I am not sure if it is ok to pass
>> objects which > 1 MB in shard response?
>> does solr use javabin serialization to optimize data gathering from
>> shards?
>> then leader shard will collect these 1 MB probabilistic data structures &
>> produce count which will be included in response.
>>
>>
>> 2. JSON Facet API  http://yonik.com/json-facet-api/
>>
>> After looking at
>> https://github.com/apache/lucene-solr/tree/master/solr/core/src/java/org/apache/solr/search/facet
>>
>> FacetProcessor.java seems very similar to Solr analytics API.
>> seems like merging happens similar way where response will include
>> objects like hll and merge them
>>
>>
>>
>>
>>
>> one key difference is  Solr analytics API is based on postFilter and JSON
>> facet API is based on ValueSource but I dont understand impact of using one
>> or the other.
>>
>>
>> Can someone help me out?
>>
>>
>>
>

Re: Adding configset in SolrCloud via API

Therefore, this becomes possible:
http://stackoverflow.com/questions/525212/how-to-run-unix-shell-script-from-java-code

Hackish, but certainly doable...  Given there's no API...

On Wed, Apr 6, 2016 at 3:44 PM, John Bickerstaff 
wrote:

> Yup - just tested - that command runs fine with Solr NOT running...
>
> On Wed, Apr 6, 2016 at 3:41 PM, John Bickerstaff  > wrote:
>
>> If you can get to the IP addresses from your application, then there's
>> probably a way...  Do you mean you're firewalled off or in some other way
>> unable to access the Solr box IP's from your Java application?
>>
>> If you're looking to do "automated build of virtual machines" there are
>> some tools like Vagrant...
>>
>> https://en.wikipedia.org/wiki/Vagrant_(software)
>>
>> Also, you probably don't need to be directly on one of the 'official"
>> SOLR machines to load the configs.  I haven't tested, but as long as you
>> have the configs and a VM or server running SOLR, you could do this
>> yourself.
>>
>> The following command (as far as I've been able to tell) ONLY creates the
>> conf "directory" in Zookeeper... nothing else...
>>
>> And, as a matter of fact, I'm almost positive SOLR did not need to be
>> running when I did this, but I did so many variations while trying to
>> figure out how to bring up a new 5.4 collection that I'm not positive - I
>> could be totally wrong on that one.
>>
>> My point is that I uploaded the configs from a machine that had no
>> collection yet...  it worked fine to set up then configs in Zookeeper.
>> Later, I issued the create collection command and referenced the config in
>> Zookeeper with the -n(?) flag...
>>
>> sudo /opt/solr/server/scripts/cloud-scripts/zkcli.sh -cmd upconfig
>> -confdir /home/john/conf/ -confname statdx -z 192.168.56.5/solr5_4
>>
>> On Wed, Apr 6, 2016 at 3:26 PM, Don Bosco Durai  wrote:
>>
>>> I have SolrCloud pre-installed. I need to create a collection, but
>>> before that I need to load the config into zookeeper.
>>>
>>> I want to automate the entire process from my Java process which is not
>>> running on any of the servers were SolrCloud is running. In short, I don’t
>>> have access to bin/solr or server/scripts/cloud-scripts, etc from my
>>> application. So I was wondering if there were any way, like uploading a zip
>>> with the configs (schema.xml, solrconfig.xml, etc.). One workaround I can
>>> thinking is of making direct zookeeper calls.
>>>
>>> Anshum, thanks for your reply. I will see if I can find the JIRA.
>>>
>>> Thanks
>>>
>>>
>>> Bosco
>>>
>>>
>>>
>>>
>>> On 4/6/16, 2:17 PM, "Erick Erickson"  wrote:
>>>
>>> >As of Solr 5.5 the bin/solr script can do this, see:
>>> >
>>> https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference
>>> >
>>> >It's still not quite what you're looking for, but uploading arbitrary
>>> >xml scripts through a browser is a security issue, so it's possible
>>> >there will never be an API call to do that.
>>> >
>>> >Best
>>> >Erick
>>> >
>>> >On Wed, Apr 6, 2016 at 1:52 PM, Anshum Gupta 
>>> wrote:
>>> >> As of now, there's no way to do so. There were some efforts on those
>>> lines but it's been on hold.
>>> >>
>>> >> -Anshum
>>> >>
>>> >>> On Apr 6, 2016, at 12:21 PM, Don Bosco Durai 
>>> wrote:
>>> >>>
>>> >>> Is there an equivalent of server/scripts/cloud-scripts/zkcli.sh
>>> -zkhost $zk_host -cmd upconfig -confdir $config_folder -confname
>>> $config_name using APIs?
>>> >>>
>>> >>> I want to bootstrap by uploading the configs via API. Once the
>>> configs are uploaded, I am now able to do everything else via API.
>>> >>>
>>> >>> Thanks
>>> >>>
>>> >>> Bosco
>>> >>>
>>> >>>
>>>
>>>
>>
>

Re: Adding configset in SolrCloud via API

Yup - just tested - that command runs fine with Solr NOT running...

On Wed, Apr 6, 2016 at 3:41 PM, John Bickerstaff 
wrote:

> If you can get to the IP addresses from your application, then there's
> probably a way...  Do you mean you're firewalled off or in some other way
> unable to access the Solr box IP's from your Java application?
>
> If you're looking to do "automated build of virtual machines" there are
> some tools like Vagrant...
>
> https://en.wikipedia.org/wiki/Vagrant_(software)
>
> Also, you probably don't need to be directly on one of the 'official" SOLR
> machines to load the configs.  I haven't tested, but as long as you have
> the configs and a VM or server running SOLR, you could do this yourself.
>
> The following command (as far as I've been able to tell) ONLY creates the
> conf "directory" in Zookeeper... nothing else...
>
> And, as a matter of fact, I'm almost positive SOLR did not need to be
> running when I did this, but I did so many variations while trying to
> figure out how to bring up a new 5.4 collection that I'm not positive - I
> could be totally wrong on that one.
>
> My point is that I uploaded the configs from a machine that had no
> collection yet...  it worked fine to set up then configs in Zookeeper.
> Later, I issued the create collection command and referenced the config in
> Zookeeper with the -n(?) flag...
>
> sudo /opt/solr/server/scripts/cloud-scripts/zkcli.sh -cmd upconfig
> -confdir /home/john/conf/ -confname statdx -z 192.168.56.5/solr5_4
>
> On Wed, Apr 6, 2016 at 3:26 PM, Don Bosco Durai  wrote:
>
>> I have SolrCloud pre-installed. I need to create a collection, but before
>> that I need to load the config into zookeeper.
>>
>> I want to automate the entire process from my Java process which is not
>> running on any of the servers were SolrCloud is running. In short, I don’t
>> have access to bin/solr or server/scripts/cloud-scripts, etc from my
>> application. So I was wondering if there were any way, like uploading a zip
>> with the configs (schema.xml, solrconfig.xml, etc.). One workaround I can
>> thinking is of making direct zookeeper calls.
>>
>> Anshum, thanks for your reply. I will see if I can find the JIRA.
>>
>> Thanks
>>
>>
>> Bosco
>>
>>
>>
>>
>> On 4/6/16, 2:17 PM, "Erick Erickson"  wrote:
>>
>> >As of Solr 5.5 the bin/solr script can do this, see:
>> >
>> https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference
>> >
>> >It's still not quite what you're looking for, but uploading arbitrary
>> >xml scripts through a browser is a security issue, so it's possible
>> >there will never be an API call to do that.
>> >
>> >Best
>> >Erick
>> >
>> >On Wed, Apr 6, 2016 at 1:52 PM, Anshum Gupta 
>> wrote:
>> >> As of now, there's no way to do so. There were some efforts on those
>> lines but it's been on hold.
>> >>
>> >> -Anshum
>> >>
>> >>> On Apr 6, 2016, at 12:21 PM, Don Bosco Durai 
>> wrote:
>> >>>
>> >>> Is there an equivalent of server/scripts/cloud-scripts/zkcli.sh
>> -zkhost $zk_host -cmd upconfig -confdir $config_folder -confname
>> $config_name using APIs?
>> >>>
>> >>> I want to bootstrap by uploading the configs via API. Once the
>> configs are uploaded, I am now able to do everything else via API.
>> >>>
>> >>> Thanks
>> >>>
>> >>> Bosco
>> >>>
>> >>>
>>
>>
>

Re: Adding configset in SolrCloud via API

If you can get to the IP addresses from your application, then there's
probably a way...  Do you mean you're firewalled off or in some other way
unable to access the Solr box IP's from your Java application?

If you're looking to do "automated build of virtual machines" there are
some tools like Vagrant...

https://en.wikipedia.org/wiki/Vagrant_(software)

Also, you probably don't need to be directly on one of the 'official" SOLR
machines to load the configs.  I haven't tested, but as long as you have
the configs and a VM or server running SOLR, you could do this yourself.

The following command (as far as I've been able to tell) ONLY creates the
conf "directory" in Zookeeper... nothing else...

And, as a matter of fact, I'm almost positive SOLR did not need to be
running when I did this, but I did so many variations while trying to
figure out how to bring up a new 5.4 collection that I'm not positive - I
could be totally wrong on that one.

My point is that I uploaded the configs from a machine that had no
collection yet...  it worked fine to set up then configs in Zookeeper.
Later, I issued the create collection command and referenced the config in
Zookeeper with the -n(?) flag...

sudo /opt/solr/server/scripts/cloud-scripts/zkcli.sh -cmd upconfig -confdir
/home/john/conf/ -confname statdx -z 192.168.56.5/solr5_4

On Wed, Apr 6, 2016 at 3:26 PM, Don Bosco Durai  wrote:

> I have SolrCloud pre-installed. I need to create a collection, but before
> that I need to load the config into zookeeper.
>
> I want to automate the entire process from my Java process which is not
> running on any of the servers were SolrCloud is running. In short, I don’t
> have access to bin/solr or server/scripts/cloud-scripts, etc from my
> application. So I was wondering if there were any way, like uploading a zip
> with the configs (schema.xml, solrconfig.xml, etc.). One workaround I can
> thinking is of making direct zookeeper calls.
>
> Anshum, thanks for your reply. I will see if I can find the JIRA.
>
> Thanks
>
>
> Bosco
>
>
>
>
> On 4/6/16, 2:17 PM, "Erick Erickson"  wrote:
>
> >As of Solr 5.5 the bin/solr script can do this, see:
> >
> https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference
> >
> >It's still not quite what you're looking for, but uploading arbitrary
> >xml scripts through a browser is a security issue, so it's possible
> >there will never be an API call to do that.
> >
> >Best
> >Erick
> >
> >On Wed, Apr 6, 2016 at 1:52 PM, Anshum Gupta 
> wrote:
> >> As of now, there's no way to do so. There were some efforts on those
> lines but it's been on hold.
> >>
> >> -Anshum
> >>
> >>> On Apr 6, 2016, at 12:21 PM, Don Bosco Durai  wrote:
> >>>
> >>> Is there an equivalent of server/scripts/cloud-scripts/zkcli.sh
> -zkhost $zk_host -cmd upconfig -confdir $config_folder -confname
> $config_name using APIs?
> >>>
> >>> I want to bootstrap by uploading the configs via API. Once the configs
> are uploaded, I am now able to do everything else via API.
> >>>
> >>> Thanks
> >>>
> >>> Bosco
> >>>
> >>>
>
>

Re: Saving Solr filter query.

Right...  You can store that anywhere - but at least consider not storing
it in your existing SOLR collection just because it's there...  It's not
really the same kind of data -- it's application meta-data and/or
user-specific data...

Getting it out later will be more difficult than if you store it
separately...

A relational data store is ideal for this kind of thing - especially if
it's associated with a user and you're already storing user data in an
RDBMS.  It's just a string...

Alternatively, a "user" document or a "saved search" document in another
Solr collection could hold the data...

On Wed, Apr 6, 2016 at 3:22 PM, Erick Erickson 
wrote:

> That's more of an app-level feature, there's nothing in Solr that does
> this for you.
>
> Some people have used a different Solr collection to store the queries
> as strings for display, but that's again something you build on top of
> Solr, not a core feature.
>
> Best,
> Erick
>
> On Wed, Apr 6, 2016 at 2:32 AM, Pritam Kute
>  wrote:
> > Hi,
> >
> > I have designed one web page on which user can search and filter his data
> > based on some term facets. I am using Apache Solr 5.3.1 for the same. It
> is
> > working perfectly fine.
> >
> > Now my requirement is to save the query which I have executed on Solr,
> so,
> > in future, if I need to search the same results, I have to just extract
> the
> > saved query and make a query to Solr server (I mean the feature like
> saving
> > the favorite filters).
> >
> > Any help would be useful. Thanks in advance.
> >
> > Thanks & Regards,
> > --
> > *Pritam Kute*
>

Re: Adding configset in SolrCloud via API

I have SolrCloud pre-installed. I need to create a collection, but before that 
I need to load the config into zookeeper. 

I want to automate the entire process from my Java process which is not running 
on any of the servers were SolrCloud is running. In short, I don’t have access 
to bin/solr or server/scripts/cloud-scripts, etc from my application. So I was 
wondering if there were any way, like uploading a zip with the configs 
(schema.xml, solrconfig.xml, etc.). One workaround I can thinking is of making 
direct zookeeper calls.

Anshum, thanks for your reply. I will see if I can find the JIRA.

Thanks

Bosco

On 4/6/16, 2:17 PM, "Erick Erickson"  wrote:

>As of Solr 5.5 the bin/solr script can do this, see:
>https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference
>
>It's still not quite what you're looking for, but uploading arbitrary
>xml scripts through a browser is a security issue, so it's possible
>there will never be an API call to do that.
>
>Best
>Erick
>
>On Wed, Apr 6, 2016 at 1:52 PM, Anshum Gupta  wrote:
>> As of now, there's no way to do so. There were some efforts on those lines 
>> but it's been on hold.
>>
>> -Anshum
>>
>>> On Apr 6, 2016, at 12:21 PM, Don Bosco Durai  wrote:
>>>
>>> Is there an equivalent of server/scripts/cloud-scripts/zkcli.sh  -zkhost 
>>> $zk_host -cmd upconfig -confdir $config_folder -confname $config_name using 
>>> APIs?
>>>
>>> I want to bootstrap by uploading the configs via API. Once the configs are 
>>> uploaded, I am now able to do everything else via API.
>>>
>>> Thanks
>>>
>>> Bosco
>>>
>>>

Re: Saving Solr filter query.

2016-04-06 Thread Erick Erickson

That's more of an app-level feature, there's nothing in Solr that does
this for you.

Some people have used a different Solr collection to store the queries
as strings for display, but that's again something you build on top of
Solr, not a core feature.

Best,
Erick

On Wed, Apr 6, 2016 at 2:32 AM, Pritam Kute
 wrote:
> Hi,
>
> I have designed one web page on which user can search and filter his data
> based on some term facets. I am using Apache Solr 5.3.1 for the same. It is
> working perfectly fine.
>
> Now my requirement is to save the query which I have executed on Solr, so,
> in future, if I need to search the same results, I have to just extract the
> saved query and make a query to Solr server (I mean the feature like saving
> the favorite filters).
>
> Any help would be useful. Thanks in advance.
>
> Thanks & Regards,
> --
> *Pritam Kute*

Re: Update Speed: QTime 1,000 - 5,000

2016-04-06 Thread Erick Erickson

you can mitigate the impact of throwing away caches on soft commits by
doing appropriate autowarming, both the newSearcher and cache settings
in solrconfig.xml.

Be aware that you don't want to go overboard here, I'd start with 20
or so as the autowarm counts for queryResultCache and filterCache.

And if you ever get warnings about "too many on deck searchers", your
commit intervals are too short or your autowarm is too long. Do not
try to fix this error by bumping the maxWarmingSearchers in
solrconfig.xml.

Best,
Erick

On Wed, Apr 6, 2016 at 3:49 AM, Alessandro Benedetti
 wrote:
> On Wed, Apr 6, 2016 at 7:53 AM, Robert Brown  wrote:
>
>> The QTime's are from the updates.
>>
>> We don't have the resource right now to switch to SolrJ, but I would
>> assume only sending updates to the leaders would take some redirects out of
>> the process,
>
> How do you route your documents now ?
> Aren't you using Solr routing ?
>
>
>> I can regularly query for the collection status to know who's who.
>>
>> I'm now more interested in the caches that are thrown away on softCommit,
>> since we do see some performance issues on queries too. Would these caches
>> affect querying and faceting?
>>
>
> You should check your caches stats and performances.
> Filter Cache could be heavily involved in querying and faceting .
> Query Result Cache as the name says woul affect the query results fetching
> as well.
> Document cache will impact in fetching what you display for the documents.
> Much more could be discussed about caching, a good start would be to verify
> how currently your caches are configured and how they are currently
> performing.
>
> Cheers
>
>
>>
>> Thanks,
>> Rob
>>
>>
>>
>>
>> On 06/04/16 00:41, Erick Erickson wrote:
>>
>>> bq: Apart from the obvious delay, I'm also seeing QTime's of 1,000 to
>>> 5,000
>>>
>>> QTimes for what? The update? Queries? If for queries, autowarming may
>>> help,
>>> especially as your soft commit is throwing away all the top-level
>>> caches (i.e. the
>>> ones configured in solrconfig.xml) every minute. It shouldn't be that bad
>>> on the
>>> lower-level Lucene caches though, at least the per-segment ones.
>>>
>>> You'll get some improvement by using SolrJ (with CloudSolrClient)
>>> rather than cURL.
>>> no matter which node you hit, about half your documents will have to
>>> be forwarded to
>>> the other shard when using cURL, whereas SolrJ (with CloudSolrClient)
>>> will route the docs
>>> to the correct leader right from the client.
>>>
>>> Best,
>>> Erick
>>>
>>> On Tue, Apr 5, 2016 at 2:53 PM, John Bickerstaff
>>>  wrote:
>>>
 A few thoughts...

  From a black-box testing perspective, you might try changing that
 softCommit time frame  to something longer and see if it makes a
 difference.

 The size of  your documents will make a difference too - so the
 comparison
 to 300 - 500 on other cloud setups may or may not be comparing apples to
 oranges...

 Are the "new" documents actually new or are you overwriting existing solr
 doc ID's?  If you are overwriting, you may want to optimize and see if
 that
 helps.



 On Tue, Apr 5, 2016 at 2:38 PM, Robert Brown 
 wrote:

 Hi,
>
> I'm currently posting updates via cURL, in batches of 1,000 docs in JSON
> files.
>
> My setup consists of 2 shards, 1 replica each, 50m docs in total.
>
> These updates are hitting a node at random, from a server across the
> Internet.
>
> Apart from the obvious delay, I'm also seeing QTime's of 1,000 to 5,000.
>
> This strikes me as quite high since I also sometimes see times of around
> 300-500, on similar cloud setups.
>
> The setup is running on VMs with rotary disks, and enough RAM to hold
> roughly half the entire index in disk cache (I'm in the process of
> upgrading this).
>
> I hard commit every 10 minutes but don't open a new searcher, just to
> make
> sure data is "safe".  I softCommit every 1 minute to make data
> available.
>
> Are there any obvious things I can do to improve my situation?
>
> Thanks,
> Rob
>
>
>
>
>
>
>>
>
>
> --
> --
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England

Re: Adding configset in SolrCloud via API

2016-04-06 Thread Erick Erickson

As of Solr 5.5 the bin/solr script can do this, see:
https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference

It's still not quite what you're looking for, but uploading arbitrary
xml scripts through a browser is a security issue, so it's possible
there will never be an API call to do that.

Best
Erick

On Wed, Apr 6, 2016 at 1:52 PM, Anshum Gupta  wrote:
> As of now, there's no way to do so. There were some efforts on those lines 
> but it's been on hold.
>
> -Anshum
>
>> On Apr 6, 2016, at 12:21 PM, Don Bosco Durai  wrote:
>>
>> Is there an equivalent of server/scripts/cloud-scripts/zkcli.sh  -zkhost 
>> $zk_host -cmd upconfig -confdir $config_folder -confname $config_name using 
>> APIs?
>>
>> I want to bootstrap by uploading the configs via API. Once the configs are 
>> uploaded, I am now able to do everything else via API.
>>
>> Thanks
>>
>> Bosco
>>
>>

Re: Adding configset in SolrCloud via API

2016-04-06 Thread Anshum Gupta

As of now, there's no way to do so. There were some efforts on those lines but 
it's been on hold.

-Anshum

> On Apr 6, 2016, at 12:21 PM, Don Bosco Durai  wrote:
> 
> Is there an equivalent of server/scripts/cloud-scripts/zkcli.sh  -zkhost 
> $zk_host -cmd upconfig -confdir $config_folder -confname $config_name using 
> APIs?
> 
> I want to bootstrap by uploading the configs via API. Once the configs are 
> uploaded, I am now able to do everything else via API.
> 
> Thanks
> 
> Bosco
> 
>

Re: MLT Query Parser

On 4/6/2016 11:07 AM, shamik wrote:
> Thanks Alessandro, that answers my doubt. in a nutshell, to make MLT Query
> parser work, you need to know the document id. I'm just curious as why this
> constraint has been added. This will not work for a bulk of use cases. For
> e.g. if we are trying to generate MLT based on a text or a keyword, how
> would I ever use this API ? My initial impression was that this was designed
> to work on a distributed mode.
>
> Now, this adds up a follow-up question as in which one is the right approach
> in a solr cloud mode. "mlt"request handler is off the equation since it's
> not supported. That leaves with MoreLikeThisComponent which has a known
> issue with performance. Is that the only availble solution then ?

The feature "More Like This/These" is built around the premise that you
are seeing one or more existing documents in a search result, and you
want to find more documents that are very similar to those.  This is why
you need an ID -- to tell Solr which document(s) you want to use as a
basis for the query.

If you're trying to find documents by keyword/text, that's just a
regular query; you don't need MLT.

MLT was not originally designed for distributed indexes.  Distributed
MLT is a *relatively* recent addition, and when I tried it, it was very
slow.  I do not know if the feature has changed much since it was
introduced three years ago.

Thanks,
Shawn

Re: How to implement Autosuggestion

2016-04-06 Thread chandan khatri

Hi Alessandro,

Thanks for replying!

Here are my answers inline.

1. "First of all, simple string autosuggestion or document autosuggestion ?
(
with more additional field to show then the label)


 Document autosuggestions
2. Are you interested in the analysis for the text to suggest ? Fuzzy
suggestions ? exact "beginning of the phrase" suggestions ? infix
suggestions ?"

--- I am interested in analysis for the text to suggest.

3. In the case you want to show the label AND the category AND whatever ( in
Amazon style to make it simple) .
A very straighforward solution is to model a specific Solr Field for your
product collection.
This field will contain the name of the product, analyzed according your
need.
Then your autosuggester will simply hit that field on each char typed, and
you can show the entire document in the suggestions ( with all the fields
you want) .

 Not sure what you mean by indexing a field with the name of product
analyzed according to my need? Can you please give an example?

Thanks again for taking time to respond to my query.

On Thu, Apr 7, 2016 at 12:59 AM, chandan khatri 
wrote:

> Hi Alessandro,
>
> Thanks for replying!
>
> Here are my answers inline.
>
>
>
>
> On Mon, Apr 4, 2016 at 6:34 PM, Alessandro Benedetti <
> abenede...@apache.org> wrote:
>
>> Hi Chandan,
>> I will answer as my previous answer to a similar topic that got lost :
>> "First of all, simple string autosuggestion or document autosuggestion ? (
>> with more additional field to show then the label)
>> Are you interested in the analysis for the text to suggest ? Fuzzy
>> suggestions ? exact "beginning of the phrase" suggestions ? infix
>> suggestions ?"
>>
>> If you need only the category *payloadField* should be what you need .
>> I never used it as a feature but it is there [1] .
>> As Reth suggested, at the moment Solr supports only one payloadField,
>> ignoring the others ( code confirms this).
>>
>> In the case you want to show the label AND the category AND whatever ( in
>> Amazon style to make it simple) .
>> A very straighforward solution is to model a specific Solr Field for your
>> product collection.
>> This field will contain the name of the product, analyzed according your
>> need.
>> Then your autosuggester will simply hit that field on each char typed, and
>> you can show the entire document in the suggestions ( with all the fields
>> you want) .
>> Or we take a look to the implementation and we contribute the support for
>> multiple *payloadField*.
>>
>> Cheers
>>
>> [1] https://cwiki.apache.org/confluence/display/solr/Suggester
>>
>> On Sun, Apr 3, 2016 at 1:09 PM, Reth RM  wrote:
>>
>> > There is a payload attribute but I'm not sure if this can be used for
>> such
>> > use case. Lets wait for others contributors to confirm.
>> > Similar question posted here:
>> >
>> >
>> http://stackoverflow.com/questions/32434186/solr-suggestion-with-multiple-payloads
>> > .
>> >
>> > If its just a category that you need then the work around(although not
>> > accurate one) that I can think of is to include the category value to
>> the
>> > same field with pipe separation and extract from it?
>> >
>> > On Sun, Apr 3, 2016 at 11:41 AM, chandan khatri <
>> chandankhat...@gmail.com>
>> > wrote:
>> >
>> > > Hi All,
>> > >
>> > > I've a query regarding autosuggestion. My use case is as below:
>> > >
>> > > 1. User enters product name (say Nokia)
>> > > 2. I want suggestions along with the category with which the product
>> > > belongs. (e.g Nokia belongs to "electronics" and "mobile" category)
>> so I
>> > > want suggestion like Nokia in electronics and Nokia in mobile.
>> > >
>> > > I am able to get the suggestions using the OOTB
>> AnalyzingInFixSuggester
>> > but
>> > > not sure how I can get the category along with the suggestion(can this
>> > > category be considered as facet of the suggestion??)
>> > >
>> > > Any help/pointer is highly appreciated.
>> > >
>> > > Thanks,
>> > > Chandan
>> > >
>> >
>>
>>
>>
>> --
>> --
>>
>> Benedetti Alessandro
>> Visiting card : http://about.me/alessandro_benedetti
>>
>> "Tyger, tyger burning bright
>> In the forests of the night,
>> What immortal hand or eye
>> Could frame thy fearful symmetry?"
>>
>> William Blake - Songs of Experience -1794 England
>>
>
>

Re: How to implement Autosuggestion

2016-04-06 Thread chandan khatri

Hi Alessandro,

Thanks for replying!

Here are my answers inline.




On Mon, Apr 4, 2016 at 6:34 PM, Alessandro Benedetti 
wrote:

> Hi Chandan,
> I will answer as my previous answer to a similar topic that got lost :
> "First of all, simple string autosuggestion or document autosuggestion ? (
> with more additional field to show then the label)
> Are you interested in the analysis for the text to suggest ? Fuzzy
> suggestions ? exact "beginning of the phrase" suggestions ? infix
> suggestions ?"
>
> If you need only the category *payloadField* should be what you need .
> I never used it as a feature but it is there [1] .
> As Reth suggested, at the moment Solr supports only one payloadField,
> ignoring the others ( code confirms this).
>
> In the case you want to show the label AND the category AND whatever ( in
> Amazon style to make it simple) .
> A very straighforward solution is to model a specific Solr Field for your
> product collection.
> This field will contain the name of the product, analyzed according your
> need.
> Then your autosuggester will simply hit that field on each char typed, and
> you can show the entire document in the suggestions ( with all the fields
> you want) .
> Or we take a look to the implementation and we contribute the support for
> multiple *payloadField*.
>
> Cheers
>
> [1] https://cwiki.apache.org/confluence/display/solr/Suggester
>
> On Sun, Apr 3, 2016 at 1:09 PM, Reth RM  wrote:
>
> > There is a payload attribute but I'm not sure if this can be used for
> such
> > use case. Lets wait for others contributors to confirm.
> > Similar question posted here:
> >
> >
> http://stackoverflow.com/questions/32434186/solr-suggestion-with-multiple-payloads
> > .
> >
> > If its just a category that you need then the work around(although not
> > accurate one) that I can think of is to include the category value to the
> > same field with pipe separation and extract from it?
> >
> > On Sun, Apr 3, 2016 at 11:41 AM, chandan khatri <
> chandankhat...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > I've a query regarding autosuggestion. My use case is as below:
> > >
> > > 1. User enters product name (say Nokia)
> > > 2. I want suggestions along with the category with which the product
> > > belongs. (e.g Nokia belongs to "electronics" and "mobile" category) so
> I
> > > want suggestion like Nokia in electronics and Nokia in mobile.
> > >
> > > I am able to get the suggestions using the OOTB AnalyzingInFixSuggester
> > but
> > > not sure how I can get the category along with the suggestion(can this
> > > category be considered as facet of the suggestion??)
> > >
> > > Any help/pointer is highly appreciated.
> > >
> > > Thanks,
> > > Chandan
> > >
> >
>
>
>
> --
> --
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
>

Adding configset in SolrCloud via API

Is there an equivalent of server/scripts/cloud-scripts/zkcli.sh  -zkhost 
$zk_host -cmd upconfig -confdir $config_folder -confname $config_name using 
APIs?

I want to bootstrap by uploading the configs via API. Once the configs are 
uploaded, I am now able to do everything else via API.

Thanks

Bosco

Re: Contrib module for Document Clustering

2016-04-06 Thread Joel Bernstein

I don't know of any contrib or module that does this. Can you describe why
you'd want to route documents to shards based on similarity? What
advantages would you get by using this approach?

Joel Bernstein
http://joelsolr.blogspot.com/

On Wed, Apr 6, 2016 at 1:36 PM, davidphilip cherian <
davidphilipcher...@gmail.com> wrote:

> Any thoughts?
>
>
> On Tue, Apr 5, 2016 at 9:05 PM, davidphilip cherian <
> davidphilipcher...@gmail.com> wrote:
>
> > Hi,
> >
> > Is there any contribution(open source contrib module) that routes
> > documents to shards based on document similarity technique? Or any
> > suggestions that integrates mahout to solr for this use case?
> >
> > From what I know, currently there are two document route strategies as
> > explained here
> > https://lucidworks.com/blog/2013/06/13/solr-cloud-document-routing/. But
> > Is there anything else that I'm missing?
> >
> >
> >
> >
> > Thanks.
> >
> >
> >
>

Re: Contrib module for Document Clustering

2016-04-06 Thread davidphilip cherian

Any thoughts?


On Tue, Apr 5, 2016 at 9:05 PM, davidphilip cherian <
davidphilipcher...@gmail.com> wrote:

> Hi,
>
> Is there any contribution(open source contrib module) that routes
> documents to shards based on document similarity technique? Or any
> suggestions that integrates mahout to solr for this use case?
>
> From what I know, currently there are two document route strategies as
> explained here
> https://lucidworks.com/blog/2013/06/13/solr-cloud-document-routing/. But
> Is there anything else that I'm missing?
>
>
>
>
> Thanks.
>
>
>

Re: search design question

2016-04-06 Thread Reth RM

Why not copy the field values of category, title, features, spec into a
common text field and then search on that field. Otherwise use a edismax
query parser and search with user search string on all the above fields may
be by boosting title, category and specs field in order to get relevant
results.
Could you please why do you need to form a query by recognizing the exact
field for each query term?



On Wed, Apr 6, 2016 at 3:07 PM, Binoy Dalal  wrote:

> I understand.
> Although I am not exactly sure how to solve this one, this should serve as
> a helpful starting point:
>
> https://lucidworks.com/resources/webinars/natural-language-search-with-solr/
>
> On Wed, 6 Apr 2016, 11:27 Midas A,  wrote:
>
> > thanks Binoy for replying ,
> >
> > i am giving you few use cases
> >
> > a)  shoes in nike  or nike shoes
> >
> > Here "nike " is brand and in this case  my query  entity is shoe and
> entity
> > type is brand
> >
> > and my result should only pink nike shoes
> >
> >
> > b)  " 32 inch  LCD TV  sony "
> >
> > 32 inch is size ,  LCD is entity type and sony is brand
> >
> >
> > in this case my solr query should be build in different manner to get
> > accurate results .
> >
> >
> >
> >
> > Probably, now u can understand my problem.
> >
> >
> > On Wed, Apr 6, 2016 at 11:12 AM, Binoy Dalal 
> > wrote:
> >
> > > Could you describe your problem in more detail with examples of your
> use
> > > cases.
> > >
> > > On Wed, 6 Apr 2016, 11:03 Midas A,  wrote:
> > >
> > > >  i have to do entity and entity type mapping with help of search
> query
> > > > while building solr query.
> > > >
> > > > how i should i design with the solr  for search.
> > > >
> > > > Please guide me .
> > > >
> > > --
> > > Regards,
> > > Binoy Dalal
> > >
> >
> --
> Regards,
> Binoy Dalal
>

Re: How to use TZ parameter in a query

2016-04-06 Thread Chris Hostetter


Please note the exact description of hte property on the URL you 
mentioned..

"The TZ parameter can be specified to override the default TimeZone (UTC) 
used for the purposes of adding and rounding in date math"

The newer ref guide docs for this param also explain...

https://cwiki.apache.org/confluence/display/solr/Working+with+Dates

"By default, all date math expressions are evaluated relative to the UTC 
TimeZone, but the TZ parameter can be specified to override this 
behaviour, by forcing all date based addition and rounding to be relative 
to the specified time zone."


The TZ param does not change the *format* of the response in XML or JSON, 
which is an ISO standard format that always uses UTC for rendering as a 
string, because it is unambiguious regardless of the client parsing it.   
Just because you might want "date range faceting by day according to 
localtime in denver" doesn't mean your python or perl or javascript code 
for parsing the response will suddenly realize that the string responses 
are sometimes GMT-7 and sometimes GMT-8 (depending on the local daylight 
savings rules in colorado)



-Hoss
http://www.lucidworks.com/

Re: MLT Query Parser

2016-04-06 Thread shamik

Thanks Alessandro, that answers my doubt. in a nutshell, to make MLT Query
parser work, you need to know the document id. I'm just curious as why this
constraint has been added. This will not work for a bulk of use cases. For
e.g. if we are trying to generate MLT based on a text or a keyword, how
would I ever use this API ? My initial impression was that this was designed
to work on a distributed mode.

Now, this adds up a follow-up question as in which one is the right approach
in a solr cloud mode. "mlt"request handler is off the equation since it's
not supported. That leaves with MoreLikeThisComponent which has a known
issue with performance. Is that the only availble solution then ?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/MLT-Query-Parser-for-SolrCloud-tp4268308p4268482.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: CompositId router

I think that's how I would approach it.  I used command-line instead of
rest api to create collection, but I think that just generates rest api
command via curl... so that will be no different as far as I can tell - I'm
just more comfortable on the command line.

Step 8 is the thing I'm not sure about - do you mean import  using a tool
or do you mean simply copy the data directories?

It's worth a try, that's for sure - although this line in one of the URLs I
gave you gives me pause...

Solr 5 has no support for reading Lucene/Solr 3.x and earlier indexes.  Be
sure to run the Lucene IndexUpgrader included with Solr 4.10 if you might
still have old 3x formatted segments in your index. Alternatively: fully
optimize your index with Solr 4.10 to make sure it consists only of one
up-to-date index segment.

What I would do is try it in a test environment and see if I run into
problems before worrying about everything that "could" go wrong...

I'm afraid I can't do you much good on the question of compositeId router -
I haven't dealt with that.  I'm just hoping that a complete recreation of
the solr environment will resolve the problem...


On Wed, Apr 6, 2016 at 10:28 AM, Anuj Lal  wrote:

> Hi John , Shawn
>
> Thanks for replying to my query . Really appreciate your responses
>
> Ideally I’d like to do node by node rolling upgrade from 4.4 to 5.5
>
> But gave this approach of rolling upgrade because I faced  issue with SolrJ
> 4.4 client connecting with 5.5 cluster or 5.5 solar client connecting with
> 4.4 cluster.
>
> I am fine with stopping whole cluster and upgrade all nodes to 5.5.
>
> What I don’t want to do recreate complete index for all clients from source
> system.  At least I’d like to avoid recreate index from source as that will
> be really time consuming process for us.
>
> This is what I have done
> 1. Stop all nodes and zookeeper
>  2. Updated solr 4.4 to 5.5 on all cluster nodes
> 3. Deleted zookeeper data
> 4. Modify solr.xml, solrconfig.xml, created core.properties as per 5.5
> changes
> 5. Boot strap from existing solr home using this command
>
> ./solr-5.5.0/server/scripts/cloud-scripts/zkcli.sh -z  port>  -cmd bootstrap -solrhome /home/solr/solrcloud-shard1 -d  sole con directory> -n  -c 
>
> 6. Start zookeeper
>
> 7. start all solr nodes
>
> all nodes comes up correctly. I can query data fro solr ui as well as from
> our application using solrJ client upgraded to 5.5 libs
>
> What I see that clusterstatus.json and /collections/  shows
> implicit router not compositId router
>
> When I update document, it doesnt go to right shard but different shard
>
> Just for trying purpose, I manually updated clusterstatus.json and
> /collections/ with compositeId router and shards keys
>  after stopping cluster and zookeeper and  restar after update of json
> files
>
>
> Based on your reply, I think suggested steps are ( please confirm if these
> are appropriate)
>
>
> 1. Stop all nodes and zookeeper
>  2. Updated solr 4.4 to 5.5 on all cluster nodes
> 3. Deleted zookeeper data
> 4. Modify solr.xml, solrconfig.xml, created core.properties as per 5.5
> changes
> 5. start zookeeper
> 5. upload config to zookeeper
> 6. Create collection using rest api
> 7. start cluster
> 8. copy collection data from 4.5 to solr 5.5 data directory
>
>
> If you can share upgrade step/process document, that will be great
>
> Thanks
> Anuj
>
>
>
>
>
>
>
> On Apr 6, 2016, at 7:58 AM, John Bickerstaff 
> wrote:
>
> I recently upgraded from 4.x to 5.5 -- it was a pain to figure it out, but
> it turns out to be fairly straightforward...
>
> Caveat: Because I run all my data into Kafka first, I was able to easily
> re-create my collections by running a microservice that pulls from Kafka
> and dumps into Solr.
>
> I have a rough draft document for how to "upgrade by replacement" in this
> way - basically, throw away the older Solr, build a new /chroot in
> Zookeeper and issue a few command line commands to upload configs (from the
> 4.x version) and create new collections in the 5.x version.
>
> If this is what you mean by "bootstrap"  (Upgrade to 5.x but use your
> index/collection/core from a previous version) then perhaps I can help.
>
> There were instructions on the Solr wiki for the differences between 4.x
> and 5.x...
>
> But before I go into a lot more detail, please drop us a line and let me
> know if that's what you need...
>
> For reference you can look at this, but I have more refined instructions
> now:
>
>
> http://stackoverflow.com/questions/34909909/how-to-correctly-add-additional-solr-5-vm-nodes-to-solr-cloud?rq=1
>
> On Wed, Apr 6, 2016 at 7:01 AM, Shawn Heisey  wrote:
>
> On 4/5/2016 3:08 PM, Anuj Lal wrote:
>
> I am new to solr.  Need some advice from more experienced solr  team
>
> members
>
>
> I am upgrading 4.4 solr cluster to 5.5
>
> One of the step I am doing for upgrade is to bootstrap from existing 4.4
>
> solr home ( after upgrading solr installation to 5.5)
>
> We'll need a lot more detail about *exac

Re: CompositId router

2016-04-06 Thread Anuj Lal

Hi John , Shawn

Thanks for replying to my query . Really appreciate your responses

Ideally I’d like to do node by node rolling upgrade from 4.4 to 5.5

But gave this approach of rolling upgrade because I faced issue with SolrJ
4.4 client connecting with 5.5 cluster or 5.5 solar client connecting with
4.4 cluster.

I am fine with stopping whole cluster and upgrade all nodes to 5.5.

What I don’t want to do recreate complete index for all clients from source
system. At least I’d like to avoid recreate index from source as that will
be really time consuming process for us.

This is what I have done
1. Stop all nodes and zookeeper
2. Updated solr 4.4 to 5.5 on all cluster nodes
3. Deleted zookeeper data
4. Modify solr.xml, solrconfig.xml, created core.properties as per 5.5
changes
5. Boot strap from existing solr home using this command

./solr-5.5.0/server/scripts/cloud-scripts/zkcli.sh -z -cmd bootstrap -solrhome /home/solr/solrcloud-shard1 -d -n -c

6. Start zookeeper

7. start all solr nodes

all nodes comes up correctly. I can query data fro solr ui as well as from
our application using solrJ client upgraded to 5.5 libs

What I see that clusterstatus.json and /collections/ shows
implicit router not compositId router

When I update document, it doesnt go to right shard but different shard

Just for trying purpose, I manually updated clusterstatus.json and
/collections/ with compositeId router and shards keys
after stopping cluster and zookeeper and restar after update of json files

Based on your reply, I think suggested steps are ( please confirm if these
are appropriate)

1. Stop all nodes and zookeeper
2. Updated solr 4.4 to 5.5 on all cluster nodes
3. Deleted zookeeper data
4. Modify solr.xml, solrconfig.xml, created core.properties as per 5.5
changes
5. start zookeeper
5. upload config to zookeeper
6. Create collection using rest api
7. start cluster
8. copy collection data from 4.5 to solr 5.5 data directory

If you can share upgrade step/process document, that will be great

Thanks
Anuj

On Apr 6, 2016, at 7:58 AM, John Bickerstaff
wrote:

I recently upgraded from 4.x to 5.5 -- it was a pain to figure it out, but
it turns out to be fairly straightforward...

Caveat: Because I run all my data into Kafka first, I was able to easily
re-create my collections by running a microservice that pulls from Kafka
and dumps into Solr.

I have a rough draft document for how to "upgrade by replacement" in this
way - basically, throw away the older Solr, build a new /chroot in
Zookeeper and issue a few command line commands to upload configs (from the
4.x version) and create new collections in the 5.x version.

If this is what you mean by "bootstrap" (Upgrade to 5.x but use your
index/collection/core from a previous version) then perhaps I can help.

There were instructions on the Solr wiki for the differences between 4.x
and 5.x...

But before I go into a lot more detail, please drop us a line and let me
know if that's what you need...

For reference you can look at this, but I have more refined instructions
now:

http://stackoverflow.com/questions/34909909/how-to-correctly-add-additional-solr-5-vm-nodes-to-solr-cloud?rq=1

On Wed, Apr 6, 2016 at 7:01 AM, Shawn Heisey wrote:

On 4/5/2016 3:08 PM, Anuj Lal wrote:

I am new to solr. Need some advice from more experienced solr team

members

I am upgrading 4.4 solr cluster to 5.5

One of the step I am doing for upgrade is to bootstrap from existing 4.4

solr home ( after upgrading solr installation to 5.5)

We'll need a lot more detail about *exactly* what "bootstrap" means
here. The fact that you chose the word "bootstrap" to describe your
process sets off a red flag in my mind, and might mean that you're doing
this upgrade in a way that won't work like you're expecting it to.

One particular question I'd like to have answered: Are you upgrading
each server in place using the exact same zkHost string as the 4.4
install was using, or are you trying to build up a new ZK
chroot/database using the existing Solr home?

I have not actually performed one of these SolrCloud upgrades, but I
would expect that if you don't connect to the existing zookeeper
database, it would probably behave exactly like you've described. If
I'm completely wrong about how this should be done, somebody please let
me know.

Going from 4.4 to 5.5 is a significant upgrade (including one major
release, 10 minor releases, and 10 bugfix releases) and you might find
that the two versions won't coexist very well. The devs do try to make
different versions compatible, but SolrCloud is evolving at an extremely
rapid pace, so this cannot be guaranteed when the version difference is
so large.

Assuming I have a correct mental picture of how you got to where you
are, there might be some things you could do to get it back on track,
but that would require significant manual fiddling, and no guarantee of
success.

Thanks,
Shawn

Re: MLT Query Parser

2016-04-06 Thread Alessandro Benedetti

Wait a second, and let's avoid any confusion.
We can have different input for a More Like This Request Handler ( if this
is what you were using).

1) the Id of the document we want to find similar documents to
2) a bunch of text

Then you have a lot of parameters that will affect the MLT core.
Specifically the :
mlt.qf=name is telling the MLT to use the field "name" for the MLT query
and the input document.

Let's go back to the query parser...
Your query : "{!mlt qf=name}1" means :
"give me similar documents to the document 1, based only on the field
"name" ".

According to what you wrote : "
Right now,I'm getting mlt documents based on a "keyword"
field"

I think the query you want is simply :
{!mlt qf=keyword}

For MLT query parser the document id is the only input supported.

Cheers

On Wed, Apr 6, 2016 at 12:29 AM, Shamik Bandopadhyay 
wrote:

> Hi,
>
>   I'm trying to use the new MLT query parser in a SolrCloud mode. As per
> the documentation, here's the syntax,
>
> {!mlt qf=name}1
>
> where "1" is the id.
>
> What I'm trying to undertsand is whether "id" is a mandatory field in
> making this work? Right now,I'm getting mlt documents based on a "keyword"
> field. With the new query parser,I'm not able to see a way to use another
> field except for id. Is this a constraint? Or there's a different syntax?
>
> Any pointers will be appreciated.
>
> Thanks,
> Shamik
>

-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England

Re: CompositId router

I'll agree with Shawn too - munging Zookeeper by hand can lead to VERY
unexpected results...

My recommendation would be to start fresh with a new 5.x setup and a new
/chroot in Zookeeper.

(This can be deleted and recreated repeatedly if necessary - I know because
I did... a lot... before I got it right)

Then, pull your data out of your old core/collection either by using the
approach in the link I sent earlier or possibly copying over the core
directory and making the changes to the xml files necessary for 5.x

These links may help you decide how to proceed...

https://cwiki.apache.org/confluence/display/solr/Upgrading+Solr
https://cwiki.apache.org/confluence/display/solr/Upgrading+a+Solr+4.x+Cluster+to+Solr+5.0
https://cwiki.apache.org/confluence/display/solr/Major+Changes+from+Solr+4+to+Solr+5

On Wed, Apr 6, 2016 at 8:58 AM, John Bickerstaff 
wrote:

> I recently upgraded from 4.x to 5.5 -- it was a pain to figure it out, but
> it turns out to be fairly straightforward...
>
> Caveat: Because I run all my data into Kafka first, I was able to easily
> re-create my collections by running a microservice that pulls from Kafka
> and dumps into Solr.
>
> I have a rough draft document for how to "upgrade by replacement" in this
> way - basically, throw away the older Solr, build a new /chroot in
> Zookeeper and issue a few command line commands to upload configs (from the
> 4.x version) and create new collections in the 5.x version.
>
> If this is what you mean by "bootstrap"  (Upgrade to 5.x but use your
> index/collection/core from a previous version) then perhaps I can help.
>
> There were instructions on the Solr wiki for the differences between 4.x
> and 5.x...
>
> But before I go into a lot more detail, please drop us a line and let me
> know if that's what you need...
>
> For reference you can look at this, but I have more refined instructions
> now:
>
>
> http://stackoverflow.com/questions/34909909/how-to-correctly-add-additional-solr-5-vm-nodes-to-solr-cloud?rq=1
>
> On Wed, Apr 6, 2016 at 7:01 AM, Shawn Heisey  wrote:
>
>> On 4/5/2016 3:08 PM, Anuj Lal wrote:
>> > I am new to solr.  Need some advice from more experienced solr  team
>> members
>> >
>> > I am upgrading 4.4 solr cluster to 5.5
>> >
>> > One of the step I am doing for upgrade is to bootstrap from existing
>> 4.4 solr home ( after upgrading solr installation to 5.5)
>>
>> We'll need a lot more detail about *exactly* what "bootstrap" means
>> here.  The fact that you chose the word "bootstrap" to describe your
>> process sets off a red flag in my mind, and might mean that you're doing
>> this upgrade in a way that won't work like you're expecting it to.
>>
>> One particular question I'd like to have answered:  Are you upgrading
>> each server in place using the exact same zkHost string as the 4.4
>> install was using, or are you trying to build up a new ZK
>> chroot/database using the existing Solr home?
>>
>> I have not actually performed one of these SolrCloud upgrades, but I
>> would expect that if you don't connect to the existing zookeeper
>> database, it would probably behave exactly like you've described.  If
>> I'm completely wrong about how this should be done, somebody please let
>> me know.
>>
>> Going from 4.4 to 5.5 is a significant upgrade (including one major
>> release, 10 minor releases, and 10 bugfix releases) and you might find
>> that the two versions won't coexist very well.  The devs do try to make
>> different versions compatible, but SolrCloud is evolving at an extremely
>> rapid pace, so this cannot be guaranteed when the version difference is
>> so large.
>>
>> Assuming I have a correct mental picture of how you got to where you
>> are, there might be some things you could do to get it back on track,
>> but that would require significant manual fiddling, and no guarantee of
>> success.
>>
>> Thanks,
>> Shawn
>>
>>
>

Re: CompositId router

I recently upgraded from 4.x to 5.5 -- it was a pain to figure it out, but
it turns out to be fairly straightforward...

Caveat: Because I run all my data into Kafka first, I was able to easily
re-create my collections by running a microservice that pulls from Kafka
and dumps into Solr.

I have a rough draft document for how to "upgrade by replacement" in this
way - basically, throw away the older Solr, build a new /chroot in
Zookeeper and issue a few command line commands to upload configs (from the
4.x version) and create new collections in the 5.x version.

If this is what you mean by "bootstrap"  (Upgrade to 5.x but use your
index/collection/core from a previous version) then perhaps I can help.

There were instructions on the Solr wiki for the differences between 4.x
and 5.x...

But before I go into a lot more detail, please drop us a line and let me
know if that's what you need...

For reference you can look at this, but I have more refined instructions
now:

http://stackoverflow.com/questions/34909909/how-to-correctly-add-additional-solr-5-vm-nodes-to-solr-cloud?rq=1

On Wed, Apr 6, 2016 at 7:01 AM, Shawn Heisey  wrote:

> On 4/5/2016 3:08 PM, Anuj Lal wrote:
> > I am new to solr.  Need some advice from more experienced solr  team
> members
> >
> > I am upgrading 4.4 solr cluster to 5.5
> >
> > One of the step I am doing for upgrade is to bootstrap from existing 4.4
> solr home ( after upgrading solr installation to 5.5)
>
> We'll need a lot more detail about *exactly* what "bootstrap" means
> here.  The fact that you chose the word "bootstrap" to describe your
> process sets off a red flag in my mind, and might mean that you're doing
> this upgrade in a way that won't work like you're expecting it to.
>
> One particular question I'd like to have answered:  Are you upgrading
> each server in place using the exact same zkHost string as the 4.4
> install was using, or are you trying to build up a new ZK
> chroot/database using the existing Solr home?
>
> I have not actually performed one of these SolrCloud upgrades, but I
> would expect that if you don't connect to the existing zookeeper
> database, it would probably behave exactly like you've described.  If
> I'm completely wrong about how this should be done, somebody please let
> me know.
>
> Going from 4.4 to 5.5 is a significant upgrade (including one major
> release, 10 minor releases, and 10 bugfix releases) and you might find
> that the two versions won't coexist very well.  The devs do try to make
> different versions compatible, but SolrCloud is evolving at an extremely
> rapid pace, so this cannot be guaranteed when the version difference is
> so large.
>
> Assuming I have a correct mental picture of how you got to where you
> are, there might be some things you could do to get it back on track,
> but that would require significant manual fiddling, and no guarantee of
> success.
>
> Thanks,
> Shawn
>
>

RE: BYOPW in security.json

2016-04-06 Thread Davis, Daniel (NIH/NLM) [C]

I'm bordering on development post, but I want to write an Authentication Plugin 
that uses Proxy Authentication and a White List.
So, it will accept a request header such as REMOTE_USER as the username from 
certain hosts, by default 127.0.0.1, ::1.
I also thought about having a whitelist of IPs that are assumed to be "admin", 
to make the CLI more usable.

-Original Message-
From: Jan Høydahl [mailto:jan@cominvent.com] 
Sent: Wednesday, April 06, 2016 4:18 AM
To: solr-user@lucene.apache.org
Subject: Re: BYOPW in security.json

Hi

Note that storing the user names and passwords in security.json is just one 
implementation, to easily get started. It uses the Sha256AuthenticationProvider 
class, which is pluggable. That means that if you require Basic Auth with some 
form of self-service management, you could/should add another 
AuthenticationProvider (implement interface 
BasicAuthPlugin.AuthenticationProvider which e.g. pulls valid users and 
passwords from a database or some other source that you control. Or perhaps 
your organization uses LDAP already, it would be convenient to create an 
LDAPAuthenticationProvider.

I would not recommend adding such complexity to the existing json backed user 
list, although it has the benefit of beting 100% self contained.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 18. mar. 2016 kl. 23.30 skrev Oakley, Craig (NIH/NLM/NCBI) [C] 
> :
> 
> When using security.json (in Solr 5.4.1 for instance), is there a recommended 
> method to allow users to change their own passwords? We certainly would not 
> want to grant blanket security-edit to all users; but requiring users to 
> divulge their intended passwords (in Email or by other means) to the 
> administrators of the Solr installation is also arguably less than optimal. 
> It is unclear whether one could setup (for each individual user: "user1" in 
> this example) something like:
> 
> "set-permission": {"name":"edit_pwd_user1", 
> "path":"/admin/authentication", 
> "params":{"command":[set-user],"login":[user1]},
> "role": "edit_pw_user1"}
> "set-user-role": {"user1": ["edit_pw_user1","other","roles","here"]}
> 
> One point that is unclear would be whether "command" and "login" are the 
> correct strings in the third line of the example above: would they instead be 
> "cmd" and "user"? "action" and "username"? something else?
> 
> Even if this worked when implemented for each individual login, it would be 
> nice to be able to say once and for all "every login can edit its own 
> password".
> 
> There could be ways to create a utility which would change the OS-ownership 
> of its own process in order to decrypt a file containing the 
> Solr-admin-password, and to use that to set the password of the Solr login 
> which matched the OS login which initiated the process; but before embarking 
> on developing such a utility, I thought I would ask whether there were other 
> suggestions.

Re: Can't get phrase field boosting to work using edismax

2016-04-06 Thread Jack Krupansky

I haven't traced through all the code recently, so I can't dispute Jan if
he knows a place that checks the output of the pf phrase analysis to see if
it is a single term, but... the INPUT to pf is definitely multiple clauses.
Regardless of the use of the keyword tokenizer, the query parser sees two
tokens, "some" and "words", and passes them as separate clauses to the code
I referenced above, which constructs quoted phrases and passes them through
the query parser again for the pf fields. What happens after that I cannot
say for sure.

But if the pf post-analysis processing does have this limitation that the
analysis of a multi-word phrase must be at least two terms, it should be
clearly documented. That's essentially what is at stake in this particular
issue.

Granted, that was my first thought, that the use of the keyword tokenizer
would be a no-no for a pf field, but this particular use case seems valid
to me, so we should consider whether the "multiple words analyze to one
term" use case should be supported, for precisely the use case at hand.

I can see wanting to have both a multi-term pf field combined with a
single-term pf field with the latter having a higher boost. For example, if
the input query exactly matches a product name field, as opposed to simply
matching a subset of a longer product name.


-- Jack Krupansky

On Wed, Apr 6, 2016 at 5:22 AM,  wrote:

> OK, well I'm not sure I agree with you. First of all, you ask me to point
> my "pf" towards a tokenized field, but I already do that (the fact that all
> text is tokenized into a single token doesn't change that fact). Also, I
> don't agree with the view that a single term phrase never is
> valid/reasonable. In this specific case, with a KeywordTokenizer, I see it
> as very reasonable indeed. And I would consider a "single term keyword
> phrase" solution more logical than a workaround using special magical
> characters inserted in the text. Just my two cents... :)
>
> Oh, hang on... If a phrase is defined as multiple tokens, and pf is used
> for phrase  boosting, does that mean that even with a regular tokenizer the
> pf won't work for fields that only contain one word? For example if the
> title of one document is "John", and the user searches for 'John' (without
> any surrounding phrase-characters), will edismax not boost this document?
>
> /Jimi
>
> -Original Message-
> From: Jan Høydahl [mailto:jan@cominvent.com]
> Sent: Wednesday, April 6, 2016 10:43 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Can't get phrase field boosting to work using edismax
>
> Hi,
>
> Phrase match via “pf” requires the target field to contain a phrase. A
> phrase is defined as multiple tokens. Yours does not contain a phrase since
> you use the KeywordTokenizer, leaving only one token in the field. eDismax
> pf will thus never kick in. Please point your “pf” towards a tokenized
> field.
>
> If what you are trying to achieve is to boost only when the whole query
> exactly matches the full content of the field, then have a look at my
> solution here https://github.com/cominvent/exactmatch
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > 5. apr. 2016 kl. 19.10 skrev jimi.hulleg...@svensktnaringsliv.se:
> >
> > Some more input, before I call it a day. Just for the heck of it, I
> tried changing minClauseSize to 0 using the Eclipse debugger, so that it
> didn't return null at line 1203, but instead returned the TermQuery on line
> 1205. Then everything worked exactly as it should. The matching document
> got boosted as expected. And in the explain output, this can be seen:
> >
> > [...]
> > 11.274228 = (MATCH) weight(exactTitle:some words^100.0 in 172)
> [DefaultSimilarity], result of:
> > [...]
> >
> > So. In my case, having minClauseSize=2 on line 550 (line 565 for solr
> 5.5.0) is the culprit. Is this a bug, or am I using the pf in the wrong
> way? Can someone explain why minClauseSize can't be set to 0 here? The
> comment simply states "we need at least two or there shouldn't be a boost",
> but no explaination *why* at least two is needed.
> >
> > Regards
> > /Jimi
> >
> > -Original Message-
> > From: jimi.hulleg...@svensktnaringsliv.se
> > [mailto:jimi.hulleg...@svensktnaringsliv.se]
> > Sent: Tuesday, April 5, 2016 6:51 PM
> > To: solr-user@lucene.apache.org
> > Subject: RE: Can't get phrase field boosting to work using edismax
> >
> > I now used the Eclipse debugger, to try and see if I can understand what
> is happening, I it seems like the ExtendedDismaxQParser simply ignores my
> pf parameter, since it doesn't interpret it as a phrase query.
> >
> > https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.6.0/
> > solr/core/src/java/org/apache/solr/search/ExtendedDismaxQParser.java
> >
> > On line 1180 I get a query object of type TermQuery (with the term
> "exactTitle:some words"). And in the if statements starting at line it is
> quite clear that if it is not a PhraseQuery or a M

Re: Can't get phrase field boosting to work using edismax

On 4/6/2016 7:13 AM, jimi.hulleg...@svensktnaringsliv.se wrote:
> Ah, thanks. It never occurred to me that clicking on the text "Create" would 
> give me a different result compared to clicking on the arrow. In my mind, 
> "Create" was simply the label, and the arrow indicating a dropdown option for 
> "things to create".

Indeed, that's a little confusing.

I found Atlassian's Jira install for Jira itself, and let them know
about the confusing UI.  Amusing tidbit:  My report is a Service Desk
request.

Thanks,
Shawn

Re: Solr 5.5.0: SearchHandler: Appending a Join query

2016-04-06 Thread Mikhail Khludnev

I suppose q= is singular param doesn't accept multiple values.

On Wed, Apr 6, 2016 at 1:01 PM, Anand Chandrashekar 
wrote:

> Greetings.
>
> 1) A join query creates an array of "q" parameter. For example, the query
>
>
> http://localhost:8983/solr/gettingstarted/select?q=projectdata%3A%22top+secret+data2%22&q=%7B!join+from=locatorByUser+to=locator%7Dusers=joe
>
> creates the following array elements for the "q" parameter.
>
> [array entry #1] projectdata:"top secret data2"
> [array entry #2] {!join from=locatorByUser to=locator}users=joe
>
> 2) I would like to enforce the join part as a mandatory parameter with the
> "users" field added programmatically. I have extended the search handler,
> and am mimicking the array entry # 2 and adding it to the SolrParams.
>
> Pseudocode handleRequestBody:
> ModifiableSolrParams modParams=new
> ModifiableSolrParams(request.getParams());
> modParams.set("q",...);//adding the join (array entry # 2) part and the
> original query
> request.setParams(modParams);
> super.handleRequestBody(request, response);
>
> I am able to mimic the exact array, but the query does not enforce the
> join. Seems to only pick the first entry. Any advice/suggestions?
>
> Thanks and regards.
> Anand.
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

RE: Can't get phrase field boosting to work using edismax

On Wednesday, April 6, 2016 2:50 PM, apa...@elyograg.org wrote:
> 
> If you can only create a service desk request, then you might be clicking the 
> "Service Desk" menu item, 
> or maybe you're clicking the little down arrow on the right side of the big 
> red "Create" button.  
> Try clicking the main (left) part of the Create button.
> 
> https://www.dropbox.com/s/u8tq8v9qvb0aq0z/solr-issue-create.png?dl=0

Ah, thanks. It never occurred to me that clicking on the text "Create" would 
give me a different result compared to clicking on the arrow. In my mind, 
"Create" was simply the label, and the arrow indicating a dropdown option for 
"things to create".

/Jimi

RE: BYOPW in security.json

2016-04-06 Thread Oakley, Craig (NIH/NLM/NCBI) [C]

Thanks.

I googled to look for examples of how to proceed, and notice that you opened 
SOLR-8951

Thanks again

-Original Message-
From: Jan Høydahl [mailto:jan@cominvent.com] 
Sent: Wednesday, April 06, 2016 4:18 AM
To: solr-user@lucene.apache.org
Subject: Re: BYOPW in security.json

Hi

Note that storing the user names and passwords in security.json is just one 
implementation, to easily get started. It uses the Sha256AuthenticationProvider 
class, which is pluggable. That means that if you require Basic Auth with some 
form of self-service management, you could/should add another 
AuthenticationProvider (implement interface 
BasicAuthPlugin.AuthenticationProvider which e.g. pulls valid users and 
passwords from a database or some other source that you control. Or perhaps 
your organization uses LDAP already, it would be convenient to create an 
LDAPAuthenticationProvider.

I would not recommend adding such complexity to the existing json backed user 
list, although it has the benefit of beting 100% self contained.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 18. mar. 2016 kl. 23.30 skrev Oakley, Craig (NIH/NLM/NCBI) [C] 
> :
> 
> When using security.json (in Solr 5.4.1 for instance), is there a recommended 
> method to allow users to change their own passwords? We certainly would not 
> want to grant blanket security-edit to all users; but requiring users to 
> divulge their intended passwords (in Email or by other means) to the 
> administrators of the Solr installation is also arguably less than optimal. 
> It is unclear whether one could setup (for each individual user: "user1" in 
> this example) something like:
> 
> "set-permission": {"name":"edit_pwd_user1",
> "path":"/admin/authentication",
> "params":{"command":[set-user],"login":[user1]},
> "role": "edit_pw_user1"}
> "set-user-role": {"user1": ["edit_pw_user1","other","roles","here"]}
> 
> One point that is unclear would be whether "command" and "login" are the 
> correct strings in the third line of the example above: would they instead be 
> "cmd" and "user"? "action" and "username"? something else?
> 
> Even if this worked when implemented for each individual login, it would be 
> nice to be able to say once and for all "every login can edit its own 
> password".
> 
> There could be ways to create a utility which would change the OS-ownership 
> of its own process in order to decrypt a file containing the 
> Solr-admin-password, and to use that to set the password of the Solr login 
> which matched the OS login which initiated the process; but before embarking 
> on developing such a utility, I thought I would ask whether there were other 
> suggestions.

Re: CompositId router

On 4/5/2016 3:08 PM, Anuj Lal wrote:
> I am new to solr.  Need some advice from more experienced solr  team members
>
> I am upgrading 4.4 solr cluster to 5.5
>
> One of the step I am doing for upgrade is to bootstrap from existing 4.4 solr 
> home ( after upgrading solr installation to 5.5)

We'll need a lot more detail about *exactly* what "bootstrap" means
here.  The fact that you chose the word "bootstrap" to describe your
process sets off a red flag in my mind, and might mean that you're doing
this upgrade in a way that won't work like you're expecting it to.

One particular question I'd like to have answered:  Are you upgrading
each server in place using the exact same zkHost string as the 4.4
install was using, or are you trying to build up a new ZK
chroot/database using the existing Solr home?

I have not actually performed one of these SolrCloud upgrades, but I
would expect that if you don't connect to the existing zookeeper
database, it would probably behave exactly like you've described.  If
I'm completely wrong about how this should be done, somebody please let
me know.

Going from 4.4 to 5.5 is a significant upgrade (including one major
release, 10 minor releases, and 10 bugfix releases) and you might find
that the two versions won't coexist very well.  The devs do try to make
different versions compatible, but SolrCloud is evolving at an extremely
rapid pace, so this cannot be guaranteed when the version difference is
so large.

Assuming I have a correct mental picture of how you got to where you
are, there might be some things you could do to get it back on track,
but that would require significant manual fiddling, and no guarantee of
success.

Thanks,
Shawn

Re: Can't get phrase field boosting to work using edismax

On 4/6/2016 2:35 AM, jimi.hulleg...@svensktnaringsliv.se wrote:
> I guess I can conclude that this is a bug. But I wasn't able to report it in 
> Jira. I just got to some servicedesk form 
> (https://issues.apache.org/jira/servicedesk/customer/portal/5/create/27) that 
> didn't seem related to solr in any way, (the affects/fix version fields 
> didn't correspond to any solr version I have heard of). 
>
> Can't a newly created jira user create bug issues straight away? If so, 
> where/how exactly?

If you can only create a service desk request, then you might be
clicking the "Service Desk" menu item, or maybe you're clicking the
little down arrow on the right side of the big red "Create" button.  Try
clicking the main (left) part of the Create button.

https://www.dropbox.com/s/u8tq8v9qvb0aq0z/solr-issue-create.png?dl=0

Thanks,
Shawn

Re: Solr 5.5.0: SearchHandler: Appending a Join query

2016-04-06 Thread Stefan Matheis

Anand,

have a look at the example schema, there is a section that explains
"invariants" which could be one solution to your question.

-Stefan

On Wed, Apr 6, 2016 at 12:01 PM, Anand Chandrashekar
 wrote:
> Greetings.
>
> 1) A join query creates an array of "q" parameter. For example, the query
>
> http://localhost:8983/solr/gettingstarted/select?q=projectdata%3A%22top+secret+data2%22&q=%7B!join+from=locatorByUser+to=locator%7Dusers=joe
>
> creates the following array elements for the "q" parameter.
>
> [array entry #1] projectdata:"top secret data2"
> [array entry #2] {!join from=locatorByUser to=locator}users=joe
>
> 2) I would like to enforce the join part as a mandatory parameter with the
> "users" field added programmatically. I have extended the search handler,
> and am mimicking the array entry # 2 and adding it to the SolrParams.
>
> Pseudocode handleRequestBody:
> ModifiableSolrParams modParams=new
> ModifiableSolrParams(request.getParams());
> modParams.set("q",...);//adding the join (array entry # 2) part and the
> original query
> request.setParams(modParams);
> super.handleRequestBody(request, response);
>
> I am able to mimic the exact array, but the query does not enforce the
> join. Seems to only pick the first entry. Any advice/suggestions?
>
> Thanks and regards.
> Anand.

Re: Can't get phrase field boosting to work using edismax

2016-04-06 Thread Jan Høydahl

> Oh, hang on... If a phrase is defined as multiple tokens, and pf is used for 
> phrase  boosting, does that mean that even with a regular tokenizer the pf 
> won't work for fields that only contain one word? For example if the title of 
> one document is "John", and the user searches for 'John' (without any 
> surrounding phrase-characters), will edismax not boost this document?

Yes, phrase boost “pf” is only applied if the user enters a phrase. Thus q=john 
will not trigger pf, since there is no phrase to boost.
My workaround, however, inserts a special token before and after both the 
indexed field and the query, so there will always be 3 or more tokens, and pf 
will kick in. You could use variations of this to have single word queries 
trigger pf boost for text in a field even if it is not an exact match.

But I agree with you that it is not very obvious, and could be better 
documented.
Could perhaps also be useful with a new edismax parameter “pfMinClauseSize” to 
force pf on single-token without this workaround. But there could be good 
reasons for the original design choice here, that we don’t know about...

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 6. apr. 2016 kl. 11.22 skrev jimi.hulleg...@svensktnaringsliv.se:
> 
> OK, well I'm not sure I agree with you. First of all, you ask me to point my 
> "pf" towards a tokenized field, but I already do that (the fact that all text 
> is tokenized into a single token doesn't change that fact). Also, I don't 
> agree with the view that a single term phrase never is valid/reasonable. In 
> this specific case, with a KeywordTokenizer, I see it as very reasonable 
> indeed. And I would consider a "single term keyword phrase" solution more 
> logical than a workaround using special magical characters inserted in the 
> text. Just my two cents... :)
> 
> Oh, hang on... If a phrase is defined as multiple tokens, and pf is used for 
> phrase  boosting, does that mean that even with a regular tokenizer the pf 
> won't work for fields that only contain one word? For example if the title of 
> one document is "John", and the user searches for 'John' (without any 
> surrounding phrase-characters), will edismax not boost this document?
> 
> /Jimi
> 
> -Original Message-
> From: Jan Høydahl [mailto:jan@cominvent.com] 
> Sent: Wednesday, April 6, 2016 10:43 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Can't get phrase field boosting to work using edismax
> 
> Hi,
> 
> Phrase match via “pf” requires the target field to contain a phrase. A phrase 
> is defined as multiple tokens. Yours does not contain a phrase since you use 
> the KeywordTokenizer, leaving only one token in the field. eDismax pf will 
> thus never kick in. Please point your “pf” towards a tokenized field.
> 
> If what you are trying to achieve is to boost only when the whole query 
> exactly matches the full content of the field, then have a look at my 
> solution here https://github.com/cominvent/exactmatch
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> 
>> 5. apr. 2016 kl. 19.10 skrev jimi.hulleg...@svensktnaringsliv.se:
>> 
>> Some more input, before I call it a day. Just for the heck of it, I tried 
>> changing minClauseSize to 0 using the Eclipse debugger, so that it didn't 
>> return null at line 1203, but instead returned the TermQuery on line 1205. 
>> Then everything worked exactly as it should. The matching document got 
>> boosted as expected. And in the explain output, this can be seen:
>> 
>> [...]
>> 11.274228 = (MATCH) weight(exactTitle:some words^100.0 in 172) 
>> [DefaultSimilarity], result of:
>> [...]
>> 
>> So. In my case, having minClauseSize=2 on line 550 (line 565 for solr 5.5.0) 
>> is the culprit. Is this a bug, or am I using the pf in the wrong way? Can 
>> someone explain why minClauseSize can't be set to 0 here? The comment simply 
>> states "we need at least two or there shouldn't be a boost", but no 
>> explaination *why* at least two is needed.
>> 
>> Regards
>> /Jimi
>> 
>> -Original Message-
>> From: jimi.hulleg...@svensktnaringsliv.se 
>> [mailto:jimi.hulleg...@svensktnaringsliv.se]
>> Sent: Tuesday, April 5, 2016 6:51 PM
>> To: solr-user@lucene.apache.org
>> Subject: RE: Can't get phrase field boosting to work using edismax
>> 
>> I now used the Eclipse debugger, to try and see if I can understand what is 
>> happening, I it seems like the ExtendedDismaxQParser simply ignores my pf 
>> parameter, since it doesn't interpret it as a phrase query.
>> 
>> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.6.0/
>> solr/core/src/java/org/apache/solr/search/ExtendedDismaxQParser.java
>> 
>> On line 1180 I get a query object of type TermQuery (with the term 
>> "exactTitle:some words"). And in the if statements starting at line it is 
>> quite clear that if it is not a PhraseQuery or a MultiPhraseQuery, or if the 
>> minClauseSize > 1 (and it is set to 2 on line

Re: Update Speed: QTime 1,000 - 5,000

2016-04-06 Thread Alessandro Benedetti

On Wed, Apr 6, 2016 at 7:53 AM, Robert Brown  wrote:

> The QTime's are from the updates.
>
> We don't have the resource right now to switch to SolrJ, but I would
> assume only sending updates to the leaders would take some redirects out of
> the process,

How do you route your documents now ?
Aren't you using Solr routing ?

> I can regularly query for the collection status to know who's who.
>
> I'm now more interested in the caches that are thrown away on softCommit,
> since we do see some performance issues on queries too. Would these caches
> affect querying and faceting?
>

You should check your caches stats and performances.
Filter Cache could be heavily involved in querying and faceting .
Query Result Cache as the name says woul affect the query results fetching
as well.
Document cache will impact in fetching what you display for the documents.
Much more could be discussed about caching, a good start would be to verify
how currently your caches are configured and how they are currently
performing.

Cheers

>
> Thanks,
> Rob
>
>
>
>
> On 06/04/16 00:41, Erick Erickson wrote:
>
>> bq: Apart from the obvious delay, I'm also seeing QTime's of 1,000 to
>> 5,000
>>
>> QTimes for what? The update? Queries? If for queries, autowarming may
>> help,
>> especially as your soft commit is throwing away all the top-level
>> caches (i.e. the
>> ones configured in solrconfig.xml) every minute. It shouldn't be that bad
>> on the
>> lower-level Lucene caches though, at least the per-segment ones.
>>
>> You'll get some improvement by using SolrJ (with CloudSolrClient)
>> rather than cURL.
>> no matter which node you hit, about half your documents will have to
>> be forwarded to
>> the other shard when using cURL, whereas SolrJ (with CloudSolrClient)
>> will route the docs
>> to the correct leader right from the client.
>>
>> Best,
>> Erick
>>
>> On Tue, Apr 5, 2016 at 2:53 PM, John Bickerstaff
>>  wrote:
>>
>>> A few thoughts...
>>>
>>>  From a black-box testing perspective, you might try changing that
>>> softCommit time frame  to something longer and see if it makes a
>>> difference.
>>>
>>> The size of  your documents will make a difference too - so the
>>> comparison
>>> to 300 - 500 on other cloud setups may or may not be comparing apples to
>>> oranges...
>>>
>>> Are the "new" documents actually new or are you overwriting existing solr
>>> doc ID's?  If you are overwriting, you may want to optimize and see if
>>> that
>>> helps.
>>>
>>>
>>>
>>> On Tue, Apr 5, 2016 at 2:38 PM, Robert Brown 
>>> wrote:
>>>
>>> Hi,

 I'm currently posting updates via cURL, in batches of 1,000 docs in JSON
 files.

 My setup consists of 2 shards, 1 replica each, 50m docs in total.

 These updates are hitting a node at random, from a server across the
 Internet.

 Apart from the obvious delay, I'm also seeing QTime's of 1,000 to 5,000.

 This strikes me as quite high since I also sometimes see times of around
 300-500, on similar cloud setups.

 The setup is running on VMs with rotary disks, and enough RAM to hold
 roughly half the entire index in disk cache (I'm in the process of
 upgrading this).

 I hard commit every 10 minutes but don't open a new searcher, just to
 make
 sure data is "safe".  I softCommit every 1 minute to make data
 available.

 Are there any obvious things I can do to improve my situation?

 Thanks,
 Rob

>

-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England

Re: How to use TZ parameter in a query

2016-04-06 Thread Alessandro Benedetti

At the moment, the tz parameter will be used to calculate the UTC date in
the query, based on the tz supplied.
In the index the dates are in UTC.
To show the dates in the same timezone we query, we should implement a
DocTransformer[1] .
This DocTransformer will check for all ( or a subset) of date field, and
render them in the input time-zone .
Would be a really nice addition, I encourage people to contribute it, can
be really nice completion for the tz parameter :)
https://issues.apache.org/jira/browse/SOLR-8952

Will do myself in the next month, it is pretty simple change, but everyone
is encouraged to contribute it :)

[1]
https://cwiki.apache.org/confluence/display/solr/Transforming+Result+Documents

On Wed, Apr 6, 2016 at 10:57 AM, Bogdan Marinescu <
bogdan.marine...@awinta.com> wrote:

> I understand. Would be nice though :)
>
> Thanks.
>
>
> On 04/06/2016 11:26 AM, jimi.hulleg...@svensktnaringsliv.se wrote:
>
>> I think that this parameter is only used to interpret the dates provided
>> in the query, like query filters. At least that is how I interpret the wiki
>> text. Your interpretation makes more sense in general though, it would be
>> nice if it was possible to modify the timezone for both the query and the
>> result.
>>
>> /Jimi
>>
>> -Original Message-
>> From: Bogdan Marinescu [mailto:bogdan.marine...@awinta.com]
>> Sent: Wednesday, April 6, 2016 11:20 AM
>> To: solr-user@lucene.apache.org
>> Subject: How to use TZ parameter in a query
>>
>> Hi,
>>
>> According to the wiki
>> https://wiki.apache.org/solr/CoreQueryParameters#TZ I can use the TZ
>> param to specify the timezone.
>> I tried to make a query and put in the raw section TZ=Europe/Berlin or
>> any other found in
>> https://en.wikipedia.org/wiki/List_of_tz_database_time_zones but no
>> luck. The date that I get back is still in UTC format.
>>
>> Any ideas what I'm doing wrong?
>>
>> Thanks
>>
>
>


-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England

Solr 5.5.0: SearchHandler: Appending a Join query

2016-04-06 Thread Anand Chandrashekar

Greetings.

1) A join query creates an array of "q" parameter. For example, the query

http://localhost:8983/solr/gettingstarted/select?q=projectdata%3A%22top+secret+data2%22&q=%7B!join+from=locatorByUser+to=locator%7Dusers=joe

creates the following array elements for the "q" parameter.

[array entry #1] projectdata:"top secret data2"
[array entry #2] {!join from=locatorByUser to=locator}users=joe

2) I would like to enforce the join part as a mandatory parameter with the
"users" field added programmatically. I have extended the search handler,
and am mimicking the array entry # 2 and adding it to the SolrParams.

Pseudocode handleRequestBody:
ModifiableSolrParams modParams=new
ModifiableSolrParams(request.getParams());
modParams.set("q",...);//adding the join (array entry # 2) part and the
original query
request.setParams(modParams);
super.handleRequestBody(request, response);

I am able to mimic the exact array, but the query does not enforce the
join. Seems to only pick the first entry. Any advice/suggestions?

Thanks and regards.
Anand.

Re: How to use TZ parameter in a query

2016-04-06 Thread Bogdan Marinescu


I understand. Would be nice though :)

Thanks.

On 04/06/2016 11:26 AM, jimi.hulleg...@svensktnaringsliv.se wrote:

I think that this parameter is only used to interpret the dates provided in the 
query, like query filters. At least that is how I interpret the wiki text. Your 
interpretation makes more sense in general though, it would be nice if it was 
possible to modify the timezone for both the query and the result.

/Jimi

-Original Message-
From: Bogdan Marinescu [mailto:bogdan.marine...@awinta.com]
Sent: Wednesday, April 6, 2016 11:20 AM
To: solr-user@lucene.apache.org
Subject: How to use TZ parameter in a query

Hi,

According to the wiki
https://wiki.apache.org/solr/CoreQueryParameters#TZ I can use the TZ param to 
specify the timezone.
I tried to make a query and put in the raw section TZ=Europe/Berlin or any 
other found in https://en.wikipedia.org/wiki/List_of_tz_database_time_zones but 
no luck. The date that I get back is still in UTC format.

Any ideas what I'm doing wrong?

Thanks

Re: search design question

2016-04-06 Thread Binoy Dalal

I understand.
Although I am not exactly sure how to solve this one, this should serve as
a helpful starting point:
https://lucidworks.com/resources/webinars/natural-language-search-with-solr/

On Wed, 6 Apr 2016, 11:27 Midas A,  wrote:

> thanks Binoy for replying ,
>
> i am giving you few use cases
>
> a)  shoes in nike  or nike shoes
>
> Here "nike " is brand and in this case  my query  entity is shoe and entity
> type is brand
>
> and my result should only pink nike shoes
>
>
> b)  " 32 inch  LCD TV  sony "
>
> 32 inch is size ,  LCD is entity type and sony is brand
>
>
> in this case my solr query should be build in different manner to get
> accurate results .
>
>
>
>
> Probably, now u can understand my problem.
>
>
> On Wed, Apr 6, 2016 at 11:12 AM, Binoy Dalal 
> wrote:
>
> > Could you describe your problem in more detail with examples of your use
> > cases.
> >
> > On Wed, 6 Apr 2016, 11:03 Midas A,  wrote:
> >
> > >  i have to do entity and entity type mapping with help of search query
> > > while building solr query.
> > >
> > > how i should i design with the solr  for search.
> > >
> > > Please guide me .
> > >
> > --
> > Regards,
> > Binoy Dalal
> >
>
-- 
Regards,
Binoy Dalal

Saving Solr filter query.

2016-04-06 Thread Pritam Kute

Hi,

I have designed one web page on which user can search and filter his data
based on some term facets. I am using Apache Solr 5.3.1 for the same. It is
working perfectly fine.

Now my requirement is to save the query which I have executed on Solr, so,
in future, if I need to search the same results, I have to just extract the
saved query and make a query to Solr server (I mean the feature like saving
the favorite filters).

Any help would be useful. Thanks in advance.

Thanks & Regards,
--
*Pritam Kute*

RE: How to use TZ parameter in a query

I think that this parameter is only used to interpret the dates provided in the 
query, like query filters. At least that is how I interpret the wiki text. Your 
interpretation makes more sense in general though, it would be nice if it was 
possible to modify the timezone for both the query and the result.

/Jimi

-Original Message-
From: Bogdan Marinescu [mailto:bogdan.marine...@awinta.com] 
Sent: Wednesday, April 6, 2016 11:20 AM
To: solr-user@lucene.apache.org
Subject: How to use TZ parameter in a query

Hi,

According to the wiki
https://wiki.apache.org/solr/CoreQueryParameters#TZ I can use the TZ param to 
specify the timezone.
I tried to make a query and put in the raw section TZ=Europe/Berlin or any 
other found in https://en.wikipedia.org/wiki/List_of_tz_database_time_zones but 
no luck. The date that I get back is still in UTC format.

Any ideas what I'm doing wrong?

Thanks

RE: Can't get phrase field boosting to work using edismax

OK, well I'm not sure I agree with you. First of all, you ask me to point my 
"pf" towards a tokenized field, but I already do that (the fact that all text 
is tokenized into a single token doesn't change that fact). Also, I don't agree 
with the view that a single term phrase never is valid/reasonable. In this 
specific case, with a KeywordTokenizer, I see it as very reasonable indeed. And 
I would consider a "single term keyword phrase" solution more logical than a 
workaround using special magical characters inserted in the text. Just my two 
cents... :)

Oh, hang on... If a phrase is defined as multiple tokens, and pf is used for 
phrase  boosting, does that mean that even with a regular tokenizer the pf 
won't work for fields that only contain one word? For example if the title of 
one document is "John", and the user searches for 'John' (without any 
surrounding phrase-characters), will edismax not boost this document?

/Jimi

-Original Message-
From: Jan Høydahl [mailto:jan@cominvent.com] 
Sent: Wednesday, April 6, 2016 10:43 AM
To: solr-user@lucene.apache.org
Subject: Re: Can't get phrase field boosting to work using edismax

Hi,

Phrase match via “pf” requires the target field to contain a phrase. A phrase 
is defined as multiple tokens. Yours does not contain a phrase since you use 
the KeywordTokenizer, leaving only one token in the field. eDismax pf will thus 
never kick in. Please point your “pf” towards a tokenized field.

If what you are trying to achieve is to boost only when the whole query exactly 
matches the full content of the field, then have a look at my solution here 
https://github.com/cominvent/exactmatch

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 5. apr. 2016 kl. 19.10 skrev jimi.hulleg...@svensktnaringsliv.se:
> 
> Some more input, before I call it a day. Just for the heck of it, I tried 
> changing minClauseSize to 0 using the Eclipse debugger, so that it didn't 
> return null at line 1203, but instead returned the TermQuery on line 1205. 
> Then everything worked exactly as it should. The matching document got 
> boosted as expected. And in the explain output, this can be seen:
> 
> [...]
> 11.274228 = (MATCH) weight(exactTitle:some words^100.0 in 172) 
> [DefaultSimilarity], result of:
> [...]
> 
> So. In my case, having minClauseSize=2 on line 550 (line 565 for solr 5.5.0) 
> is the culprit. Is this a bug, or am I using the pf in the wrong way? Can 
> someone explain why minClauseSize can't be set to 0 here? The comment simply 
> states "we need at least two or there shouldn't be a boost", but no 
> explaination *why* at least two is needed.
> 
> Regards
> /Jimi
> 
> -Original Message-
> From: jimi.hulleg...@svensktnaringsliv.se 
> [mailto:jimi.hulleg...@svensktnaringsliv.se]
> Sent: Tuesday, April 5, 2016 6:51 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Can't get phrase field boosting to work using edismax
> 
> I now used the Eclipse debugger, to try and see if I can understand what is 
> happening, I it seems like the ExtendedDismaxQParser simply ignores my pf 
> parameter, since it doesn't interpret it as a phrase query.
> 
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.6.0/
> solr/core/src/java/org/apache/solr/search/ExtendedDismaxQParser.java
> 
> On line 1180 I get a query object of type TermQuery (with the term 
> "exactTitle:some words"). And in the if statements starting at line it is 
> quite clear that if it is not a PhraseQuery or a MultiPhraseQuery, or if the 
> minClauseSize > 1 (and it is set to 2 on line 550) the method simply returns 
> null (ie ignoring my pf parameter). Why is this happening?
> 
> I use Solr 4.6 by the way... I forgot to mention that in my original message.
> 
> 
> -Original Message-
> From: jimi.hulleg...@svensktnaringsliv.se 
> [mailto:jimi.hulleg...@svensktnaringsliv.se]
> Sent: Tuesday, April 5, 2016 5:36 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Can't get phrase field boosting to work using edismax
> 
> OK. Interesting. But... I added a solr.TrimFilterFactory at the end of my 
> analyzer definition. Shouldn't that take care of the added space at the end? 
> The admin analysis page indicates that it works as it should, but I still 
> can't get edismax to boost.
> 
> -Original Message-
> From: Jack Krupansky [mailto:jack.krupan...@gmail.com]
> Sent: Tuesday, April 5, 2016 4:42 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Can't get phrase field boosting to work using edismax
> 
> It looks like the code constructing the boost phrase for pf will always add a 
> trailing blank, which is never a problem when a normal tokenizer is used that 
> removes white space, but the keyword tokenizer will preserve that extra 
> space, which prevents an exact match.
> 
> See line 531:
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/5.5.0/
> solr/core/src/java/org/apache/solr/search/ExtendedDismaxQParser.java
> 
> I'd say

How to use TZ parameter in a query

2016-04-06 Thread Bogdan Marinescu


Hi,

According to the wiki 
https://wiki.apache.org/solr/CoreQueryParameters#TZ I can use the TZ 
param to specify the timezone.
I tried to make a query and put in the raw section TZ=Europe/Berlin or 
any other found in 
https://en.wikipedia.org/wiki/List_of_tz_database_time_zones but no 
luck. The date that I get back is still in UTC format.


Any ideas what I'm doing wrong?

Thanks

Re: Can't get phrase field boosting to work using edismax

2016-04-06 Thread Jan Høydahl

Hi,

Phrase match via “pf” requires the target field to contain a phrase. A phrase 
is defined as multiple tokens. Yours does not contain a phrase since you use 
the KeywordTokenizer, leaving only one token in the field. eDismax pf will thus 
never kick in. Please point your “pf” towards a tokenized field.

If what you are trying to achieve is to boost only when the whole query exactly 
matches the full content of the field, then have a look at my solution here 
https://github.com/cominvent/exactmatch

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 5. apr. 2016 kl. 19.10 skrev jimi.hulleg...@svensktnaringsliv.se:
> 
> Some more input, before I call it a day. Just for the heck of it, I tried 
> changing minClauseSize to 0 using the Eclipse debugger, so that it didn't 
> return null at line 1203, but instead returned the TermQuery on line 1205. 
> Then everything worked exactly as it should. The matching document got 
> boosted as expected. And in the explain output, this can be seen:
> 
> [...]
> 11.274228 = (MATCH) weight(exactTitle:some words^100.0 in 172) 
> [DefaultSimilarity], result of:
> [...]
> 
> So. In my case, having minClauseSize=2 on line 550 (line 565 for solr 5.5.0) 
> is the culprit. Is this a bug, or am I using the pf in the wrong way? Can 
> someone explain why minClauseSize can't be set to 0 here? The comment simply 
> states "we need at least two or there shouldn't be a boost", but no 
> explaination *why* at least two is needed.
> 
> Regards
> /Jimi
> 
> -Original Message-
> From: jimi.hulleg...@svensktnaringsliv.se 
> [mailto:jimi.hulleg...@svensktnaringsliv.se] 
> Sent: Tuesday, April 5, 2016 6:51 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Can't get phrase field boosting to work using edismax
> 
> I now used the Eclipse debugger, to try and see if I can understand what is 
> happening, I it seems like the ExtendedDismaxQParser simply ignores my pf 
> parameter, since it doesn't interpret it as a phrase query.
> 
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.6.0/solr/core/src/java/org/apache/solr/search/ExtendedDismaxQParser.java
> 
> On line 1180 I get a query object of type TermQuery (with the term 
> "exactTitle:some words"). And in the if statements starting at line it is 
> quite clear that if it is not a PhraseQuery or a MultiPhraseQuery, or if the 
> minClauseSize > 1 (and it is set to 2 on line 550) the method simply returns 
> null (ie ignoring my pf parameter). Why is this happening?
> 
> I use Solr 4.6 by the way... I forgot to mention that in my original message.
> 
> 
> -Original Message-
> From: jimi.hulleg...@svensktnaringsliv.se 
> [mailto:jimi.hulleg...@svensktnaringsliv.se]
> Sent: Tuesday, April 5, 2016 5:36 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Can't get phrase field boosting to work using edismax
> 
> OK. Interesting. But... I added a solr.TrimFilterFactory at the end of my 
> analyzer definition. Shouldn't that take care of the added space at the end? 
> The admin analysis page indicates that it works as it should, but I still 
> can't get edismax to boost.
> 
> -Original Message-
> From: Jack Krupansky [mailto:jack.krupan...@gmail.com]
> Sent: Tuesday, April 5, 2016 4:42 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Can't get phrase field boosting to work using edismax
> 
> It looks like the code constructing the boost phrase for pf will always add a 
> trailing blank, which is never a problem when a normal tokenizer is used that 
> removes white space, but the keyword tokenizer will preserve that extra 
> space, which prevents an exact match.
> 
> See line 531:
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/5.5.0/solr/core/src/java/org/apache/solr/search/ExtendedDismaxQParser.java
> 
> I'd say it's a bug, but more a narrow use case that wasn't considered or 
> tested.
> 
> -- Jack Krupansky
> 
> On Tue, Apr 5, 2016 at 7:50 AM,  wrote:
> 
>> Hi,
>> 
>> I'm trying to boost documents using a phrase field boosting (ie the pf 
>> parameter for edismax), but I can't get it to work (ie boosting 
>> documents where the pf field match the query as a phrase).
>> 
>> As far as I can tell, solr, or more specifically the edismax handler, 
>> does
>> *something* when I add this parameter. I know this because the QTime 
>> increases from around 5-10ms to around 30-40 ms, and the score explain 
>> structure is *slightly* modified (though with the same final score for 
>> all documents). But nowhere in the explain structure can I see 
>> anything about the pf. And I can't understand that. Shouldn't it be 
>> included in the explain? If not, is there any way to force it to be included 
>> somehow?
>> 
>> The query looks something like this:
>> 
>> 
>> ?q=some+words&rows=10&sort=score+desc&debugQuery=true&fl=objectid,exac
>> tTitle,score%2C%5Bexplain+style%3Dtext%5D&qf=title%5E2&qf=swedishText1
>> %5E1&defType=edismax&pf=exactTitle%5E5&wt=xml&indent=true
>>

RE: Can't get phrase field boosting to work using edismax