Translating SQL having clause in Solr query

2018-08-07 Thread SAGAR PALAO
Hi Guys,



I am new to the Solr world and am using Solr 5.4.1. I have a question and
hope to find an answer here.



Consider a SQL query as:



select A, sum(B), sum(C)
from Col1
where D = 'Test'
group by A
having sum(B) > 20
order by A
limit 100



I am able to partially translate the above into a Solr query using a pivot facet,
but how do I implement the having clause in my Solr query? Without the having
clause the number of records is too large, so pivoting on A returns a very large
result. In addition, the number of distinct values for A is very high.
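For reference, the partial pivot-facet translation described above might look roughly like
this on Solr 5.x (Col1, A, B, C and D come from the SQL; the host, the tag name and
everything else are assumptions):

  curl http://localhost:8983/solr/Col1/select \
    --data-urlencode 'q=*:*' \
    --data-urlencode 'fq=D:Test' \
    --data-urlencode 'rows=0' \
    --data-urlencode 'stats=true' \
    --data-urlencode 'stats.field={!tag=piv}B' \
    --data-urlencode 'stats.field={!tag=piv}C' \
    --data-urlencode 'facet=true' \
    --data-urlencode 'facet.pivot={!stats=piv}A' \
    --data-urlencode 'facet.limit=100' \
    --data-urlencode 'facet.sort=index'

As far as I know, classic pivot faceting on 5.x has no server-side equivalent of the
having clause, so the sum(B) > 20 cut would have to be applied client-side after the
facets come back.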



Please advise me on how to solve this problem.



P.S. It is not possible for me to use the SQL interface of Solr 6, which would
allow the above SQL to execute.



Regards,

Sagar Palao


Re: Problem with fuzzy search and accentuation

2018-08-07 Thread Monique Monteiro
Hi Erick,

In fact, stemming was the culprit for the problem.

Thanks!
Monique Monteiro

On Fri, Aug 3, 2018 at 3:45 PM Erick Erickson 
wrote:

> Stemming is getting in the way here. You could probably use copyField
> to a field that doesn't stem and fuzzy search against that field
> rather than the stemmed one.
>
> Best,
> Erick
>
> On Fri, Aug 3, 2018 at 11:31 AM, Monique Monteiro
>  wrote:
> > By adding debug=true, I get the following:
> >
> >
> >- administração (correct result):
> >
> > "debug":{
> > "rawquerystring":"administração",
> > "querystring":"administração",
> > "parsedquery":"text:administr",
> > "parsedquery_toString":"text:administr",
> > "QParser":"LuceneQParser"}}
> >
> >
> >- administração~ (incorrect behaviour, no results):
> >
> > "debug":{
> > "rawquerystring":"administração~",
> > "querystring":"administração~",
> > "parsedquery":"text:administração~2",
> > "parsedquery_toString":"text:administração~2",
> > "QParser":"LuceneQParser"}}
> >
> >
> >- tribunal (correct result):
> >
> > "debug":{
> > "rawquerystring":"tribunal",
> > "querystring":"tribunal",
> > "parsedquery":"text:tribunal",
> > "parsedquery_toString":"text:tribunal",
> > "QParser":"LuceneQParser"}}
> >
> >
> >- tribubal (correct result, no accents):
> >
> >  "debug":{
> > "rawquerystring":"tribubal~",
> > "querystring":"tribubal~",
> > "parsedquery":"text:tribubal~2",
> > "parsedquery_toString":"text:tribubal~2",
> > "QParser":"LuceneQParser"}}
> >
> > On Fri, Aug 3, 2018 at 3:26 PM Erick Erickson 
> > wrote:
> >
> >> What does adding &debug=query show you the parsed query is in the two
> >> cases?
> >>
> >> My guess is that accent folding is kicking in one case but not the
> >> other, but that's
> >> a blind guess.
> >>
> >>
> >>
> >> On Fri, Aug 3, 2018 at 11:19 AM, Monique Monteiro
> >>  wrote:
> >> > Hi all,
> >> >
> >> > I'm having a problem when I search for a word with some non-ASCII
> >> > characters in combination with fuzzy search.
> >> >
> >> > For example, if I type 'administração' or 'contratação' (both words
> end
> >> > with 'ção'), the search results are returned correctly.  However, if I
> >> type
> >> > 'administração~', no result is returned.  For other terms, I haven't
> >> found
> >> > any problem.
> >> >
> >> > My Solr version is  6.6.3.
> >> >
> >> > Has anyone any idea about what may cause this issue?
> >> >
> >> > Thanks in advance.
> >> >
> >> > --
> >> > Monique Monteiro
> >> > Twitter: http://twitter.com/monilouise
> >>
> >
> >
> > --
> > Monique Monteiro
> > Twitter: http://twitter.com/monilouise
>


-- 
Monique Monteiro
Twitter: http://twitter.com/monilouise
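For anyone hitting the same issue, here is a minimal sketch of the copyField approach
Erick suggests, via the Schema API on 6.x. The collection name and the field names
(text, text_unstemmed) are assumptions, and the stock text_general type is used here
only because it applies no stemming; the idea is simply to index an unstemmed copy
and run the fuzzy query against that field instead:

  curl -X POST -H 'Content-type:application/json' \
    http://localhost:8983/solr/mycollection/schema -d '{
      "add-field":      { "name": "text_unstemmed",
                          "type": "text_general",
                          "indexed": true, "stored": false,
                          "multiValued": true },
      "add-copy-field": { "source": "text", "dest": "text_unstemmed" }
    }'

After reindexing, the fuzzy query would then target the copy, e.g.
text_unstemmed:administração~.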


Re: deprecated field types

2018-08-07 Thread Steve Rowe
I created a JIRA issue to track the Trie field removal effort: 
https://issues.apache.org/jira/browse/SOLR-12632

--
Steve
www.lucidworks.com

> On Aug 7, 2018, at 11:48 AM, Shawn Heisey  wrote:
> 
> On 8/6/2018 11:52 PM, Hendrik Haddorp wrote:
>> Below the table the following is stated:
>> /All Trie* numeric and date field types have been deprecated in favor of 
>> *Point field types. Point field types are better at range queries (speed, 
>> memory, disk), however simple field:value queries underperform relative to 
>> Trie. Either accept this, or continue to use Trie fields. This shortcoming 
>> may be addressed in a future release./
>> 
>> Given that it is suggested that one can keep using these fields can I expect 
>> that the types are not being removed in Solr 8?
> 
> As far as I know, they will be removed in 8.  They WOULD have been removed in 
> 7.0 -- the underlying Lucene legacy numeric classes that make the Trie fields 
> possible were removed in Lucene 7.  By including the source code for the 
> legacy numeric classes in the Solr codebase, we were able to keep the Trie 
> fields around for another major version.  This was extremely important for 
> backwards compatibility -- without doing that, it would not have been 
> possible for Solr 7 to read a large fraction of existing Solr 6 indexes that 
> were NOT using deprecated Solr 6 classes.
> 
> Even if Trie fields are removed, Solr 8 will be able to read any Solr 7 index 
> that is not using deprecated types.  This is the historical backward 
> compatibility guarantee that Solr has always had.
> 
> Thanks,
> Shawn
> 



Re: executing /suggest in Admin Console

2018-08-07 Thread Shawn Heisey
On 8/6/2018 11:18 AM, Steve Pruitt wrote:
> Changing the request handler to /suggest in the Admin Console Query panel 
> doesn't work.  It was a guess on my part to see if it would.
>
> Is there a way to do this, or do I need to always use a browser, Postman, etc.
> for debugging?

I've never used the suggest functionality in Solr, but I checked the
documentation, and the parameters look very different from those of a regular
query.  The query interface in the admin UI is designed for regular queries.

If you leave everything blank, and put the parameters you need into the
"Raw Query Parameters" box separated with ampersands, it might work. 
Note that you'll probably have to manually URL encode any special
characters in that box.  Unlike the other boxes, the raw box is designed
to allow characters that are special in URLs.  I do not know if you need
to blank out the "q" box (which defaults to a value of *:*), but it
wouldn't surprise me.

Thanks,
Shawn
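To make that concrete, a sketch against a typical suggester setup (the handler path,
suggester name and prefix are assumptions):

  # outside the admin UI:
  curl 'http://localhost:8983/solr/mycollection/suggest?suggest=true&suggest.dictionary=mySuggester&suggest.q=elec'

  # in the admin UI Query panel: set Request-Handler to /suggest and paste into
  # "Raw Query Parameters":
  #   suggest=true&suggest.dictionary=mySuggester&suggest.q=elec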



Re: deprecated field types

2018-08-07 Thread Shawn Heisey

On 8/6/2018 11:52 PM, Hendrik Haddorp wrote:

Below the table the following is stated:
/All Trie* numeric and date field types have been deprecated in favor 
of *Point field types. Point field types are better at range queries 
(speed, memory, disk), however simple field:value queries underperform 
relative to Trie. Either accept this, or continue to use Trie fields. 
This shortcoming may be addressed in a future release./


Given that it is suggested that one can keep using these fields can I 
expect that the types are not being removed in Solr 8?


As far as I know, they will be removed in 8.  They WOULD have been 
removed in 7.0 -- the underlying Lucene legacy numeric classes that make 
the Trie fields possible were removed in Lucene 7.  By including the 
source code for the legacy numeric classes in the Solr codebase, we were 
able to keep the Trie fields around for another major version.  This was 
extremely important for backwards compatibility -- without doing that, 
it would not have been possible for Solr 7 to read a large fraction of 
existing Solr 6 indexes that were NOT using deprecated Solr 6 classes.


Even if Trie fields are removed, Solr 8 will be able to read any Solr 7 
index that is not using deprecated types.  This is the historical 
backward compatibility guarantee that Solr has always had.


Thanks,
Shawn
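For anyone planning the migration, a hedged sketch of adding a Point-based type
through the Schema API (the collection, type and field names are made up, and note
that switching an existing field from Trie to Point requires a full reindex):

  curl -X POST -H 'Content-type:application/json' \
    http://localhost:8983/solr/mycollection/schema -d '{
      "add-field-type": { "name": "plong",
                          "class": "solr.LongPointField",
                          "docValues": true },
      "add-field":      { "name": "quantity", "type": "plong",
                          "indexed": true, "stored": true }
    }'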



Re: Recipe for moving to solr cloud without reindexing

2018-08-07 Thread Erick Erickson
Bjarke:

One thing, what version of Solr are you moving _from_ and _to_?
Solr/Lucene only guarantee one major backward revision so you can copy
an index created with Solr 6 to another Solr 6 or Solr 7, but you
couldn't copy an index created with Solr 5 to Solr 7...

Also note that shard splitting is a very expensive operation, so be patient.

Best,
Erick

On Tue, Aug 7, 2018 at 6:17 AM, Rahul Singh
 wrote:
> Bjarke,
>
> I am imagining that at some point you may need to shard that data if it
> grows. Or do you imagine this data to remain static?
>
> Generally you want to add SolrCloud to do three things: 1. Increase
> availability with replicas. 2. Increase available data via shards. 3. Increase
> fault tolerance with leaders and replicas spread around the cluster.
>
> You would be bypassing general high-availability / distributed-computing
> practices by trying not to reindex.
>
> Rahul
> On Aug 7, 2018, 7:06 AM -0400, Bjarke Buur Mortensen , 
> wrote:
>> Hi List,
>>
>> is there a cookbook recipe for moving an existing solr core to a solr cloud
>> collection.
>>
>> We currently have a single machine with a large core (~150gb), and we would
>> like to move to solr cloud.
>>
>> I haven't been able to find anything that reuses an existing index, so any
>> pointers much appreciated.
>>
>> Thanks,
>> Bjarke


Re: Recipe for moving to solr cloud without reindexing

2018-08-07 Thread Rahul Singh
Bjarke,

I am imagining that at some point you may need to shard that data if it grows.
Or do you imagine this data to remain static?

Generally you want to add SolrCloud to do three things: 1. Increase availability
with replicas. 2. Increase available data via shards. 3. Increase fault tolerance
with leaders and replicas spread around the cluster.

You would be bypassing general high-availability / distributed-computing
practices by trying not to reindex.

Rahul
On Aug 7, 2018, 7:06 AM -0400, Bjarke Buur Mortensen , 
wrote:
> Hi List,
>
> is there a cookbook recipe for moving an existing solr core to a solr cloud
> collection.
>
> We currently have a single machine with a large core (~150gb), and we would
> like to move to solr cloud.
>
> I haven't been able to find anything that reuses an existing index, so any
> pointers much appreciated.
>
> Thanks,
> Bjarke


Re: Solr cloud in kubernetes

2018-08-07 Thread jonasdkhansen
Hi, I also got this running:
https://github.com/freedev/solrcloud-zookeeper-kubernetes

My problem is also that instances 2 and 3 will not start. If I exec into
them and run bin/solr start -cloud, I can start them on a port other
than 32080, but that's not what we want.

Does anyone have a solution to this yet?



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: SolrQuery Highlight Problem - Not working for text

2018-08-07 Thread THADC
Thanks Erick, that indeed was the problem. All good now!



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Recipe for moving to solr cloud without reindexing

2018-08-07 Thread Bjarke Buur Mortensen
Right, that seems like a way to go, will give it a try.

Thanks!
/Bjarke

2018-08-07 14:08 GMT+02:00 Markus Jelsma :

> Hello Bjarke,
>
> You can use shard splitting:
> https://lucene.apache.org/solr/guide/6_6/collections-
> api.html#CollectionsAPI-splitshard
>
> Regards,
> Markus
>
>
>
> -Original message-
> > From:Bjarke Buur Mortensen 
> > Sent: Tuesday 7th August 2018 13:47
> > To: solr-user@lucene.apache.org
> > Subject: Re: Recipe for moving to solr cloud without reindexing
> >
> > Thank you, that is of course a way to go, but I would actually like to be
> > able to shard ...
> > Could I use your approach and add shards dynamically?
> >
> >
> > 2018-08-07 13:28 GMT+02:00 Markus Jelsma :
> >
> > > Hello Bjarke,
> > >
> > > If you are not going to shard you can just create a 1 shard/1 replica
> > > collection, shut down Solr, copy the data directory into the replica's
> > > directory and start up again.
> > >
> > > Regards,
> > > Markus
> > >
> > > -Original message-
> > > > From:Bjarke Buur Mortensen 
> > > > Sent: Tuesday 7th August 2018 13:06
> > > > To: solr-user@lucene.apache.org
> > > > Subject: Recipe for moving to solr cloud without reindexing
> > > >
> > > > Hi List,
> > > >
> > > > is there a cookbook recipe for moving an existing solr core to a solr
> > > cloud
> > > > collection.
> > > >
> > > > We currently have a single machine with a large core (~150gb), and we
> > > would
> > > > like to move to solr cloud.
> > > >
> > > > I haven't been able to find anything that reuses an existing index,
> so
> > > any
> > > > pointers much appreciated.
> > > >
> > > > Thanks,
> > > > Bjarke
> > > >
> > >
> >
>


RE: Recipe for moving to solr cloud without reindexing

2018-08-07 Thread Markus Jelsma
Hello Bjarke,

You can use shard splitting:
https://lucene.apache.org/solr/guide/6_6/collections-api.html#CollectionsAPI-splitshard

Regards,
Markus
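For reference, a single split via the Collections API looks roughly like this
(collection and shard names are placeholders):

  # optionally add &async=<requestId> to run the (expensive) split asynchronously
  curl 'http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=mycollection&shard=shard1'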

 
 
-Original message-
> From:Bjarke Buur Mortensen 
> Sent: Tuesday 7th August 2018 13:47
> To: solr-user@lucene.apache.org
> Subject: Re: Recipe for moving to solr cloud without reindexing
> 
> Thank you, that is of course a way to go, but I would actually like to be
> able to shard ...
> Could I use your approach and add shards dynamically?
> 
> 
> 2018-08-07 13:28 GMT+02:00 Markus Jelsma :
> 
> > Hello Bjarke,
> >
> > If you are not going to shard you can just create a 1 shard/1 replica
> > collection, shut down Solr, copy the data directory into the replica's
> > directory and start up again.
> >
> > Regards,
> > Markus
> >
> > -Original message-
> > > From:Bjarke Buur Mortensen 
> > > Sent: Tuesday 7th August 2018 13:06
> > > To: solr-user@lucene.apache.org
> > > Subject: Recipe for moving to solr cloud without reindexing
> > >
> > > Hi List,
> > >
> > > is there a cookbook recipe for moving an existing solr core to a solr
> > cloud
> > > collection.
> > >
> > > We currently have a single machine with a large core (~150gb), and we
> > would
> > > like to move to solr cloud.
> > >
> > > I haven't been able to find anything that reuses an existing index, so
> > any
> > > pointers much appreciated.
> > >
> > > Thanks,
> > > Bjarke
> > >
> >
> 


Re: Recipe for moving to solr cloud without reindexing

2018-08-07 Thread Bjarke Buur Mortensen
Thank you, that is of course a way to go, but I would actually like to be
able to shard ...
Could I use your approach and add shards dynamically?


2018-08-07 13:28 GMT+02:00 Markus Jelsma :

> Hello Bjarke,
>
> If you are not going to shard you can just create a 1 shard/1 replica
> collection, shut down Solr, copy the data directory into the replica's
> directory and start up again.
>
> Regards,
> Markus
>
> -Original message-
> > From:Bjarke Buur Mortensen 
> > Sent: Tuesday 7th August 2018 13:06
> > To: solr-user@lucene.apache.org
> > Subject: Recipe for moving to solr cloud without reindexing
> >
> > Hi List,
> >
> > is there a cookbook recipe for moving an existing solr core to a solr
> cloud
> > collection.
> >
> > We currently have a single machine with a large core (~150gb), and we
> would
> > like to move to solr cloud.
> >
> > I haven't been able to find anything that reuses an existing index, so
> any
> > pointers much appreciated.
> >
> > Thanks,
> > Bjarke
> >
>


RE: Recipe for moving to solr cloud without reindexing

2018-08-07 Thread Markus Jelsma
Hello Bjarke,

If you are not going to shard you can just create a 1 shard/1 replica 
collection, shut down Solr, copy the data directory into the replica's 
directory and start up again.

Regards,
Markus
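A rough sketch of those steps (the collection name, paths and ZooKeeper address are
placeholders, and the replica's core directory name will vary by Solr version and
install):

  # 1. with the new SolrCloud node already running, create an empty
  #    1 shard / 1 replica collection
  bin/solr create -c mycollection -shards 1 -replicationFactor 1

  # 2. stop Solr, then replace the replica's (empty) index with the old core's index
  bin/solr stop -all
  rm -rf /new/solr/server/solr/mycollection_shard1_replica1/data/index
  cp -r /old/solr/server/solr/mycore/data/index \
        /new/solr/server/solr/mycollection_shard1_replica1/data/

  # 3. start again in cloud mode
  bin/solr start -c -z localhost:2181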
 
-Original message-
> From:Bjarke Buur Mortensen 
> Sent: Tuesday 7th August 2018 13:06
> To: solr-user@lucene.apache.org
> Subject: Recipe for moving to solr cloud without reindexing
> 
> Hi List,
> 
> is there a cookbook recipe for moving an existing solr core to a solr cloud
> collection.
> 
> We currently have a single machine with a large core (~150gb), and we would
> like to move to solr cloud.
> 
> I haven't been able to find anything that reuses an existing index, so any
> pointers much appreciated.
> 
> Thanks,
> Bjarke
> 


Recipe for moving to solr cloud without reindexing

2018-08-07 Thread Bjarke Buur Mortensen
Hi List,

is there a cookbook recipe for moving an existing solr core to a solr cloud
collection.

We currently have a single machine with a large core (~150gb), and we would
like to move to solr cloud.

I haven't been able to find anything that reuses an existing index, so any
pointers much appreciated.

Thanks,
Bjarke


Re: Solr timeAllowed metric

2018-08-07 Thread Mikhail Khludnev
facet.method=fc and facet.method=fcs are not, but facet.method=enum has a
chance to be stopped when timeAllowed is exceeded.
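In other words, something like the request below may be cut short by timeAllowed
during the enum faceting, while the same request with facet.method=fc or fcs would
run the faceting to completion (the field name and the 500 ms budget are arbitrary):

  curl http://localhost:8983/solr/mycollection/select \
    --data-urlencode 'q=some expensive query' \
    --data-urlencode 'timeAllowed=500' \
    --data-urlencode 'facet=true' \
    --data-urlencode 'facet.field=category' \
    --data-urlencode 'facet.method=enum'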

On Tue, Aug 7, 2018 at 9:40 AM Wei  wrote:

> Thanks Mikhail! Is traditional facet subject to timeAllowed?
>
> On Mon, Aug 6, 2018 at 3:46 AM, Mikhail Khludnev  wrote:
>
> > One note: enum facets might be stopped by timeAllowed.
> >
> > On Mon, Aug 6, 2018 at 1:45 PM Mikhail Khludnev  wrote:
> >
> > > Hello, Wei.
> > >
> > > "Document collection" is done along side with "scoring process". So,
> > Solr
> > > will abort the request if
> > > timeAllowed is exceeded during the scoring process.
> > > Query, MLT, grouping are subject of timeAllowed constrains, but facet,
> > > json.facet https://issues.apache.org/jira/browse/SOLR-12478, stats,
> > debug
> > > are not.
> > >
> > > On Fri, Aug 3, 2018 at 11:34 PM Wei  wrote:
> > >
> > >> Hi,
> > >>
> > >> We tried to use solr's timeAllowed parameter to restrict the time
> spend
> > on
> > >> expensive queries.  But as described at
> > >>
> > >>
> > >>
> https://lucene.apache.org/solr/guide/6_6/common-query-parameters.html#
> > CommonQueryParameters-ThetimeAllowedParameter
> > >>
> > >> " This value is only checked at the time of Query Expansion and
> Document
> > >> collection" .  Does that mean Solr will not abort the request if
> > >> timeAllowed is exceeded during the scoring process? What are the
> > >> components
> > >> (query, facet,  stats, debug etc) this metric is effectively used?
> > >>
> > >> Thanks,
> > >> Wei
> > >>
> > >
> > >
> > > --
> > > Sincerely yours
> > > Mikhail Khludnev
> > >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> >
>


-- 
Sincerely yours
Mikhail Khludnev


Re: Highlighting the search keywords

2018-08-07 Thread Renuka Srishti
First of all, thanks to everyone; it's a great community and an amazing
experience to work with Apache Solr.

I was trying to use the highlight component inside the suggest request handler
by listing it in the handler's components like this:

  <str>suggest</str>
  <str>highlight</str>

so that I can use suggestions and the highlighter at the same time, but it's not
working. Am I missing something?


Thanks
Renuka Srishti

On Wed 1 Aug, 2018, 12:05 Nicolas Franck,  wrote:

> Nope, that is how it works. It is not in place.
>
> > On 31 Jul 2018, at 21:57, Renuka Srishti 
> wrote:
> >
> > Hi All,
> >
> > I was using highlighting in Solr; Solr gives highlighting results within
> > the response, but they are not included within the documents.
> > Am I missing something? Can I configure it so that it shows the highlighted
> > keywords matched within the documents?
> >
> > Thanks
> > Renuka Srishti
>
>


Re: Block Join Faceting in Solr 7.2

2018-08-07 Thread Mikhail Khludnev
 uniqueBlock is not faster than BlockJoinFacet in 7.4.

On Tue, Aug 7, 2018 at 8:05 AM Aditya Gandhi  wrote:

> I'm querying an Index which has two types of child documents (let's call
> them ChildTypeA and ChildTypeB)
> I wrap the subqueries for each of these documents in a boolean clause,
> something like this:
>
> *q=+{! parent which=type:parent } +{! parent
> which=type:parent }*
>
>
> I've been trying to get facet counts on documents of ChildTypeA (rolled up
> by parent) and I've tried the following approaches
>
>
>- Tried Block Join Faceting using the JSON API  i.e. using the
>unique(_root_) approach.
>   -  Enabled docValues on _root_
>   - *This did not scale well*
>- Tried using the BlockJoinFacet component.
>   - Had to customize it since it expects that only one
>   *ToParentBlockJoinQuery* clause to be present in the query.
>   - Since I needed facet counts only on ChildTypeA, I changed it to
>   ignore the clause on ChildTypeB
>   - I did not enable docValues on _root_ since it was not mentioned in
>   the documentation.
>   - *This approach did not scale well*
>
> I need advice on whether I could have done anything better in either of the
> two approaches I've tried so far, and whether there is some other approach I
> could try.
> Would using the uniqueBlock in 7.4 help? (Though this would require me to
> upgrade my Solr version)
>


-- 
Sincerely yours
Mikhail Khludnev
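For completeness, a sketch of the uniqueBlock(_root_) variant on 7.4, should the
upgrade happen. The query, the field names and the type:childA discriminator are
placeholders; the idea is to facet over the ChildTypeA children and roll the counts
up to distinct parents:

  curl http://localhost:8983/solr/mycollection/select \
    --data-urlencode 'q=type:parent' \
    --data-urlencode 'rows=0' \
    --data-urlencode 'json.facet={
      childA_attrs: {
        type: terms,
        field: attrA,
        domain: { blockChildren: "type:parent", filter: "type:childA" },
        facet: { parents: "uniqueBlock(_root_)" }
      }
    }'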