Re: Need some help on solr versions (LTS vs stable)

2019-11-06 Thread Erick Erickson
Pretty much correct. The only change I’d make is that 7x is not actively being
supported, in the sense that only seriously critical bugs will be addressed.
You’ll note that the last release of 7x was 7.7.2 in early June. Increased
functionality, speedups, etc. won’t be back-ported.

So I can’t think of any reason to go with 7x over 8x if you’re starting 
something new.

Best,
Erick

> On Nov 6, 2019, at 11:58 AM, suyog joshi  wrote:
> 
> Hi Erick,
> 
> Thank you so much for sharing the detailed information; indeed, it's really
> helpful for us to plan things out. We really appreciate your guidance.
> 
> So we can say it's better to go with the latest stable version (8.x) instead of
> 7.x, which is LTS right now but could soon become EOL after the launch of 9.x
> sometime early next year.
> 
> Kindly correct me if I missed out on something!
> 
> Will reach out to you/the community in case any additional info is needed.
> 
> Once again, thanks much!
> 
> Regards,
> Suyog Joshi
> 
> 
> 
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html



Re: Need some help on solr versions (LTS vs stable)

2019-11-06 Thread suyog joshi
Hi Erick,

Thank you so much for sharing the detailed information; indeed, it's really
helpful for us to plan things out. We really appreciate your guidance.

So we can say it's better to go with the latest stable version (8.x) instead of
7.x, which is LTS right now but could soon become EOL after the launch of 9.x
sometime early next year.

Kindly correct me if I missed out on something!

Will reach out to you/the community in case any additional info is needed.

Once again, thanks much!

Regards,
Suyog Joshi



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Need some help on solr versions (LTS vs stable)

2019-11-06 Thread Erick Erickson
It’s variable. The policy is that we try very hard to maintain one major
version of back-compat. So generally, if you start with, say, 7x, upgrading to 8x
should be relatively straightforward. However, you will _not_ be able to
upgrade from 7x to 9x; you must re-index everything from scratch.

The development process is this:

- People work on “master”, the future 9.0

- most changes are back-ported to the current one-less-major-version, in this 
case 8x. Periodically (on no fixed schedule, but usually 3-4 times a year) a 
new 8x version is released. Some changes to master are not backported as they 
are major changes that would be difficult/impossible to backport.

- at some point, especially when enough non-backported changes have 
accumulated, we decide to release 9.0 and everything bumps up one, i.e. master 
is the future 10.0, work is done there and backported to the stable 9x 

- In the current situation, where work is done on the future 9.0 and 8.x is the 
stable branch, there will be _no_ work done on 7x excepting egregious problems 
which at this point are pretty much exclusively security vulnerabilities. 

- As I said, it’s variable. I expect 9.0 to happen sometime in the first half 
of next year, but there are no solid plans for that, it’s just how I personally 
think things are shaping up.

- Finally, the transition from the last release of a major version to the first 
release of a new major version is _usually_ not a huge deal. New major releases 
are free to remove deprecated methods and processes though, so that’s one thing 
to watch for.

So in a nutshell, if you are starting a new project you have two choices:

- use the latest 8.x. That’ll get you the longest period during which fixes will be
made to that branch, although development will taper off on that branch as 9.0
gets released. A variant here is to start with 8x and, if 9.0 gets released
before go-live, try upgrading partway through the project.

- If your time-frame is long enough, start with master (the future 9.0) which 
you’ll have to compile yourself, understanding that
  - it may be unstable
  - the timeframe for an official release is not fixed.



> On Nov 6, 2019, at 1:00 AM, suyog joshi  wrote:
> 
> Hi Team,
> 
> Can you please guide us on the below queries about Solr versions?
> 
> 1. Are there any major differences (for security, platform stability etc.)
> between the current LTS and stable Solr versions?
> 2. How long does a version remain in LTS before becoming EoL?
> 3. How frequently does the LTS version get changed?
> 4. What will be the next LTS version for Solr (current is 7.7.x)?
> 
> 
> Kindly advise; your guidance will be really helpful for us to select the correct
> version in our infra.
> 
> Regards,
> Suyog Joshi
> 
> 
> 
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html



Need some help on solr versions (LTS vs stable)

2019-11-05 Thread suyog joshi
Hi Team,

Can you please guide us on the below queries about Solr versions?

1. Are there any major differences (for security, platform stability etc.)
between the current LTS and stable Solr versions?
2. How long does a version remain in LTS before becoming EoL?
3. How frequently does the LTS version get changed?
4. What will be the next LTS version for Solr (current is 7.7.x)?


Kindly advise; your guidance will be really helpful for us to select the correct
version in our infra.

Regards,
Suyog Joshi



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Help to learn synonymQueryStyle and FieldTypeSimilarity

2019-10-22 Thread Paras Lehana
Please explain more about what you need to understand, what you have
understood, and what your requirements are.

On Fri, 18 Oct 2019 at 12:19, Shubham Goswami 
wrote:

> Hi Community
>
> I am a beginner in Solr and I am trying to understand the working of
> synonymQueryStyle and FieldTypeSimilarity, but I am still not clear how they
> work exactly.
> Can somebody please help me to understand this with the help of an example?
> Any help will be appreciated. Thanks in advance.
>
> --
> *Thanks & Regards*
> Shubham Goswami
> Enterprise Software Engineer
> *HotWax Systems*
> *Enterprise open source experts*
> cell: +91-7803886288
> office: 0731-409-3684
> http://www.hotwaxsystems.com
>


-- 
-- 
Regards,

*Paras Lehana* [65871]
Software Programmer, Auto-Suggest,
IndiaMART Intermesh Ltd.

8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
Noida, UP, IN - 201303

Mob.: +91-9560911996
Work: 01203916600 | Extn:  *8173*

-- 
IMPORTANT: 
NEVER share your IndiaMART OTP/ Password with anyone.


Re: Help with Stream Graph

2019-10-20 Thread Rajeswari Natarajan
Thanks Joel.  That fixed the problem.

Regards,
Rajeswari

On Fri, Oct 18, 2019 at 12:50 PM Joel Bernstein  wrote:

> The query that is created to me looks looked good but it returns no
> results. Let's just do a basic query using the select handler:
>
> product_s:product1
>
> If this brings back zero results then we know we have a problem with the
> data.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Fri, Oct 18, 2019 at 1:41 PM Rajeswari Natarajan 
> wrote:
>
> > Hi Joel,
> >
> > Do you see anything wrong in the config or data . I am using 7.6.
> >
> > Thanks,
> > Rajeswari
> >
> > On Thu, Oct 17, 2019 at 8:36 AM Rajeswari Natarajan 
> > wrote:
> >
> > > My config is from
> > >
> > >
> > >
> >
> https://github.com/apache/lucene-solr/tree/branch_7_6/solr/solrj/src/test-files/solrj/solr/configsets/streaming/conf
> > >
> > >
> > >  />
> > >
> > >  > > docValues="true"/>
> > >
> > >
> > >
> > > 
> > >
> > >  > omitNorms="true"
> > > positionIncrementGap="0"/>
> > >
> > >
> > >
> > > Thanks,
> > >
> > > Rajeswari
> > >
> > > On Thu, Oct 17, 2019 at 8:16 AM Rajeswari Natarajan <
> rajis...@gmail.com>
> > > wrote:
> > >
> > >> I tried below query  and it returns o results
> > >>
> > >>
> > >>
> >
> http://localhost:8983/solr/knr/export?{!terms+f%3Dproduct_s}product1=false=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2
> 
> > <
> http://localhost:8983/solr/knr/export?%7B!terms+f%3Dproduct_s%7Dproduct1=false=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2
> >
> > >> <
> >
> http://localhost:8983/solr/knr/export?%7B!terms+f%3Dproduct_s%7Dproduct1=false=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2
> > >
> > >>
> > >>
> > >> {
> > >>   "responseHeader":{"status":0},
> > >>   "response":{
> > >> "numFound":0,
> > >> "docs":[]}}
> > >>
> > >> Regards,
> > >> Rajeswari
> > >> On Thu, Oct 17, 2019 at 8:05 AM Rajeswari Natarajan <
> rajis...@gmail.com
> > >
> > >> wrote:
> > >>
> > >>> Thanks Joel.
> > >>>
> > >>> Here is the logs for below request
> > >>>
> > >>> curl --data-urlencode
> > >>> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
> > >>> http://localhost:8983/solr/knr/stream
> > >>>
> > >>> 2019-10-17 15:02:06.969 INFO  (qtp952486988-280) [c:knr s:shard1
> > >>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> > >>> [knr_shard1_replica_n1]  webapp=/solr path=/stream
> > >>>
> >
> params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s")}
> > >>> status=0 QTime=0
> > >>>
> > >>> 2019-10-17 15:02:06.975 INFO  (qtp952486988-192) [c:knr s:shard1
> > >>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> > >>> [knr_shard1_replica_n1]  webapp=/solr path=/export
> > >>>
> >
> params={q={!terms+f%3Dproduct_s}product1=false=off=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2}
> > >>> hits=0 status=0 QTime=1
> > >>>
> > >>>
> > >>>
> > >>> Here is the logs for
> > >>>
> > >>>
> > >>>
> > >>> curl --data-urlencode
> > >>>
> >
> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
> > >>> leaves")' http://localhost:8983/solr/knr/stream
> > >>>
> > >>>
> > >>> 2019-10-17 15:03:57.068 INFO  (qtp952486988-356) [c:knr s:shard1
> > >>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> > >>> [knr_shard1_replica_n1]  webapp=/solr path=/stream
> > >>>
> >
> params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s",scatter%3D"branches,+leaves")}
> > >>> status=0 QTime=0
> > >>>
> > >>> 2019-10-17 15:03:57.071 INFO  (qtp952486988-400) [c:knr s:shard1
> > >>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> > >>> [knr_shard1_replica_n1]  webapp=/solr path=/export
> > >>>
> >
> params={q={!terms+f%3Dproduct_s}product1=false=off=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2}
> > >>> hits=0 status=0 QTime=0
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> Thank you,
> > >>>
> > >>> Rajeswari
> > >>>
> > >>> On Thu, Oct 17, 2019 at 5:23 AM Joel Bernstein 
> > >>> wrote:
> > >>>
> >  Can you show the logs from this request. There will be a Solr query
> > that
> >  gets sent with product1 searched against the product_s field. Let's
> > see
> >  how
> >  many documents that query returns.
> > 
> > 
> >  Joel Bernstein
> >  http://joelsolr.blogspot.com/
> > 
> > 
> >  On Thu, Oct 17, 2019 at 1:41 AM Rajeswari Natarajan <
> > rajis...@gmail.com
> >  >
> >  wrote:
> > 
> >  > Hi,
> >  >
> >  > Since the stream graph query for my use case , didn't work as  i
> > took
> >  the
> >  > data from solr source code test and also copied the schema and
> >  > solrconfig.xml from solr 7.6 source code.  Had to substitute few
> >  variables.
> >  >
> >  > Posted below data
> >  >
> >  > curl -X POST http://localhost:8983/solr/knr/update -H
> > 

Re: Help with Stream Graph

2019-10-18 Thread Joel Bernstein
The query that is created looks good to me, but it returns no
results. Let's just do a basic query using the select handler:

product_s:product1

If this brings back zero results then we know we have a problem with the
data.
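
A minimal way to run that check from the command line might look like the
sketch below (collection name knr taken from this thread; rows=0 just reports
the hit count):

curl "http://localhost:8983/solr/knr/select?q=product_s:product1&rows=0&wt=json"

If numFound comes back as 0, the indexing step is where to look.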

Joel Bernstein
http://joelsolr.blogspot.com/


On Fri, Oct 18, 2019 at 1:41 PM Rajeswari Natarajan 
wrote:

> Hi Joel,
>
> Do you see anything wrong in the config or data . I am using 7.6.
>
> Thanks,
> Rajeswari
>
> On Thu, Oct 17, 2019 at 8:36 AM Rajeswari Natarajan 
> wrote:
>
> > My config is from
> >
> >
> >
> https://github.com/apache/lucene-solr/tree/branch_7_6/solr/solrj/src/test-files/solrj/solr/configsets/streaming/conf
> >
> >
> > 
> >
> >  > docValues="true"/>
> >
> >
> >
> > 
> >
> >  omitNorms="true"
> > positionIncrementGap="0"/>
> >
> >
> >
> > Thanks,
> >
> > Rajeswari
> >
> > On Thu, Oct 17, 2019 at 8:16 AM Rajeswari Natarajan 
> > wrote:
> >
> >> I tried below query  and it returns o results
> >>
> >>
> >>
> http://localhost:8983/solr/knr/export?{!terms+f%3Dproduct_s}product1=false=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2
> 
> >> <
> http://localhost:8983/solr/knr/export?%7B!terms+f%3Dproduct_s%7Dproduct1=false=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2
> >
> >>
> >>
> >> {
> >>   "responseHeader":{"status":0},
> >>   "response":{
> >> "numFound":0,
> >> "docs":[]}}
> >>
> >> Regards,
> >> Rajeswari
> >> On Thu, Oct 17, 2019 at 8:05 AM Rajeswari Natarajan  >
> >> wrote:
> >>
> >>> Thanks Joel.
> >>>
> >>> Here is the logs for below request
> >>>
> >>> curl --data-urlencode
> >>> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
> >>> http://localhost:8983/solr/knr/stream
> >>>
> >>> 2019-10-17 15:02:06.969 INFO  (qtp952486988-280) [c:knr s:shard1
> >>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> >>> [knr_shard1_replica_n1]  webapp=/solr path=/stream
> >>>
> params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s")}
> >>> status=0 QTime=0
> >>>
> >>> 2019-10-17 15:02:06.975 INFO  (qtp952486988-192) [c:knr s:shard1
> >>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> >>> [knr_shard1_replica_n1]  webapp=/solr path=/export
> >>>
> params={q={!terms+f%3Dproduct_s}product1=false=off=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2}
> >>> hits=0 status=0 QTime=1
> >>>
> >>>
> >>>
> >>> Here is the logs for
> >>>
> >>>
> >>>
> >>> curl --data-urlencode
> >>>
> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
> >>> leaves")' http://localhost:8983/solr/knr/stream
> >>>
> >>>
> >>> 2019-10-17 15:03:57.068 INFO  (qtp952486988-356) [c:knr s:shard1
> >>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> >>> [knr_shard1_replica_n1]  webapp=/solr path=/stream
> >>>
> params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s",scatter%3D"branches,+leaves")}
> >>> status=0 QTime=0
> >>>
> >>> 2019-10-17 15:03:57.071 INFO  (qtp952486988-400) [c:knr s:shard1
> >>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> >>> [knr_shard1_replica_n1]  webapp=/solr path=/export
> >>>
> params={q={!terms+f%3Dproduct_s}product1=false=off=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2}
> >>> hits=0 status=0 QTime=0
> >>>
> >>>
> >>>
> >>>
> >>> Thank you,
> >>>
> >>> Rajeswari
> >>>
> >>> On Thu, Oct 17, 2019 at 5:23 AM Joel Bernstein 
> >>> wrote:
> >>>
>  Can you show the logs from this request. There will be a Solr query
> that
>  gets sent with product1 searched against the product_s field. Let's
> see
>  how
>  many documents that query returns.
> 
> 
>  Joel Bernstein
>  http://joelsolr.blogspot.com/
> 
> 
>  On Thu, Oct 17, 2019 at 1:41 AM Rajeswari Natarajan <
> rajis...@gmail.com
>  >
>  wrote:
> 
>  > Hi,
>  >
>  > Since the stream graph query for my use case , didn't work as  i
> took
>  the
>  > data from solr source code test and also copied the schema and
>  > solrconfig.xml from solr 7.6 source code.  Had to substitute few
>  variables.
>  >
>  > Posted below data
>  >
>  > curl -X POST http://localhost:8983/solr/knr/update -H
>  > 'Content-type:text/csv' -d '
>  > id, basket_s, product_s, prics_f
>  > 90,basket1,product1,20
>  > 91,basket1,product3,30
>  > 92,basket1,product5,1
>  > 93,basket2,product1,2
>  > 94,basket2,product6,5
>  > 95,basket2,product7,10
>  > 96,basket3,product4,20
>  > 97,basket3,product3,10
>  > 98,basket3,product1,10
>  > 99,basket4,product4,40
>  > 110,basket4,product3,10
>  > 111,basket4,product1,10'
>  > After this I committed and made sure the data got published. to solr
>  >
>  > curl --data-urlencode
>  > 

Re: Help with Stream Graph

2019-10-18 Thread Rajeswari Natarajan
Hi Joel,

Do you see anything wrong in the config or data? I am using 7.6.

Thanks,
Rajeswari

On Thu, Oct 17, 2019 at 8:36 AM Rajeswari Natarajan 
wrote:

> My config is from
>
>
> https://github.com/apache/lucene-solr/tree/branch_7_6/solr/solrj/src/test-files/solrj/solr/configsets/streaming/conf
>
>
> 
>
>  docValues="true"/>
>
>
>
> 
>
>  omitNorms="true"
> positionIncrementGap="0"/>
>
>
>
> Thanks,
>
> Rajeswari
>
> On Thu, Oct 17, 2019 at 8:16 AM Rajeswari Natarajan 
> wrote:
>
>> I tried below query  and it returns o results
>>
>>
>> http://localhost:8983/solr/knr/export?{!terms+f%3Dproduct_s}product1=false=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2
>> 
>>
>>
>> {
>>   "responseHeader":{"status":0},
>>   "response":{
>> "numFound":0,
>> "docs":[]}}
>>
>> Regards,
>> Rajeswari
>> On Thu, Oct 17, 2019 at 8:05 AM Rajeswari Natarajan 
>> wrote:
>>
>>> Thanks Joel.
>>>
>>> Here is the logs for below request
>>>
>>> curl --data-urlencode
>>> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
>>> http://localhost:8983/solr/knr/stream
>>>
>>> 2019-10-17 15:02:06.969 INFO  (qtp952486988-280) [c:knr s:shard1
>>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
>>> [knr_shard1_replica_n1]  webapp=/solr path=/stream
>>> params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s")}
>>> status=0 QTime=0
>>>
>>> 2019-10-17 15:02:06.975 INFO  (qtp952486988-192) [c:knr s:shard1
>>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
>>> [knr_shard1_replica_n1]  webapp=/solr path=/export
>>> params={q={!terms+f%3Dproduct_s}product1=false=off=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2}
>>> hits=0 status=0 QTime=1
>>>
>>>
>>>
>>> Here is the logs for
>>>
>>>
>>>
>>> curl --data-urlencode
>>> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
>>> leaves")' http://localhost:8983/solr/knr/stream
>>>
>>>
>>> 2019-10-17 15:03:57.068 INFO  (qtp952486988-356) [c:knr s:shard1
>>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
>>> [knr_shard1_replica_n1]  webapp=/solr path=/stream
>>> params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s",scatter%3D"branches,+leaves")}
>>> status=0 QTime=0
>>>
>>> 2019-10-17 15:03:57.071 INFO  (qtp952486988-400) [c:knr s:shard1
>>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
>>> [knr_shard1_replica_n1]  webapp=/solr path=/export
>>> params={q={!terms+f%3Dproduct_s}product1=false=off=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2}
>>> hits=0 status=0 QTime=0
>>>
>>>
>>>
>>>
>>> Thank you,
>>>
>>> Rajeswari
>>>
>>> On Thu, Oct 17, 2019 at 5:23 AM Joel Bernstein 
>>> wrote:
>>>
 Can you show the logs from this request. There will be a Solr query that
 gets sent with product1 searched against the product_s field. Let's see
 how
 many documents that query returns.


 Joel Bernstein
 http://joelsolr.blogspot.com/


 On Thu, Oct 17, 2019 at 1:41 AM Rajeswari Natarajan >>> >
 wrote:

 > Hi,
 >
 > Since the stream graph query for my use case , didn't work as  i took
 the
 > data from solr source code test and also copied the schema and
 > solrconfig.xml from solr 7.6 source code.  Had to substitute few
 variables.
 >
 > Posted below data
 >
 > curl -X POST http://localhost:8983/solr/knr/update -H
 > 'Content-type:text/csv' -d '
 > id, basket_s, product_s, prics_f
 > 90,basket1,product1,20
 > 91,basket1,product3,30
 > 92,basket1,product5,1
 > 93,basket2,product1,2
 > 94,basket2,product6,5
 > 95,basket2,product7,10
 > 96,basket3,product4,20
 > 97,basket3,product3,10
 > 98,basket3,product1,10
 > 99,basket4,product4,40
 > 110,basket4,product3,10
 > 111,basket4,product1,10'
 > After this I committed and made sure the data got published. to solr
 >
 > curl --data-urlencode
 > 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
 > http://localhost:8983/solr/knr/stream
 >
 > {
 >
 >   "result-set":{
 >
 > "docs":[{
 >
 > "EOF":true,
 >
 > "RESPONSE_TIME":4}]}}
 >
 >
 > and if I add *scatter="branches, leaves" , there is one doc.*
 >
 >
 >
 > curl --data-urlencode
 >
 >
 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
 > leaves")' http://localhost:8983/solr/knr/stream
 >
 > {
 >
 >   "result-set":{
 >
 > "docs":[{
 >
 > "node":"product1",
 >
 > "collection":"knr",
 >
 > "field":"node",
 >
 > "level":0}
 >
 >   ,{
 >
 > "EOF":true,
 >
 > "RESPONSE_TIME":4}]}}

Help to learn synonymQueryStyle and FieldTypeSimilarity

2019-10-18 Thread Shubham Goswami
Hi Community

I am a beginner in Solr and I am trying to understand the working of
synonymQueryStyle and FieldTypeSimilarity, but I am still not clear how they
work exactly.
Can somebody please help me to understand this with the help of an example?
Any help will be appreciated. Thanks in advance.

-- 
*Thanks & Regards*
Shubham Goswami
Enterprise Software Engineer
*HotWax Systems*
*Enterprise open source experts*
cell: +91-7803886288
office: 0731-409-3684
http://www.hotwaxsystems.com


Re: Help with Stream Graph

2019-10-17 Thread Rajeswari Natarajan
My config is from

https://github.com/apache/lucene-solr/tree/branch_7_6/solr/solrj/src/test-files/solrj/solr/configsets/streaming/conf














Thanks,

Rajeswari

On Thu, Oct 17, 2019 at 8:16 AM Rajeswari Natarajan 
wrote:

> I tried below query  and it returns o results
>
>
> http://localhost:8983/solr/knr/export?{!terms+f%3Dproduct_s}product1=false=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2
> 
>
>
> {
>   "responseHeader":{"status":0},
>   "response":{
> "numFound":0,
> "docs":[]}}
>
> Regards,
> Rajeswari
> On Thu, Oct 17, 2019 at 8:05 AM Rajeswari Natarajan 
> wrote:
>
>> Thanks Joel.
>>
>> Here is the logs for below request
>>
>> curl --data-urlencode
>> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
>> http://localhost:8983/solr/knr/stream
>>
>> 2019-10-17 15:02:06.969 INFO  (qtp952486988-280) [c:knr s:shard1
>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
>> [knr_shard1_replica_n1]  webapp=/solr path=/stream
>> params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s")}
>> status=0 QTime=0
>>
>> 2019-10-17 15:02:06.975 INFO  (qtp952486988-192) [c:knr s:shard1
>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
>> [knr_shard1_replica_n1]  webapp=/solr path=/export
>> params={q={!terms+f%3Dproduct_s}product1=false=off=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2}
>> hits=0 status=0 QTime=1
>>
>>
>>
>> Here is the logs for
>>
>>
>>
>> curl --data-urlencode
>> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
>> leaves")' http://localhost:8983/solr/knr/stream
>>
>>
>> 2019-10-17 15:03:57.068 INFO  (qtp952486988-356) [c:knr s:shard1
>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
>> [knr_shard1_replica_n1]  webapp=/solr path=/stream
>> params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s",scatter%3D"branches,+leaves")}
>> status=0 QTime=0
>>
>> 2019-10-17 15:03:57.071 INFO  (qtp952486988-400) [c:knr s:shard1
>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
>> [knr_shard1_replica_n1]  webapp=/solr path=/export
>> params={q={!terms+f%3Dproduct_s}product1=false=off=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2}
>> hits=0 status=0 QTime=0
>>
>>
>>
>>
>> Thank you,
>>
>> Rajeswari
>>
>> On Thu, Oct 17, 2019 at 5:23 AM Joel Bernstein 
>> wrote:
>>
>>> Can you show the logs from this request. There will be a Solr query that
>>> gets sent with product1 searched against the product_s field. Let's see
>>> how
>>> many documents that query returns.
>>>
>>>
>>> Joel Bernstein
>>> http://joelsolr.blogspot.com/
>>>
>>>
>>> On Thu, Oct 17, 2019 at 1:41 AM Rajeswari Natarajan 
>>> wrote:
>>>
>>> > Hi,
>>> >
>>> > Since the stream graph query for my use case , didn't work as  i took
>>> the
>>> > data from solr source code test and also copied the schema and
>>> > solrconfig.xml from solr 7.6 source code.  Had to substitute few
>>> variables.
>>> >
>>> > Posted below data
>>> >
>>> > curl -X POST http://localhost:8983/solr/knr/update -H
>>> > 'Content-type:text/csv' -d '
>>> > id, basket_s, product_s, prics_f
>>> > 90,basket1,product1,20
>>> > 91,basket1,product3,30
>>> > 92,basket1,product5,1
>>> > 93,basket2,product1,2
>>> > 94,basket2,product6,5
>>> > 95,basket2,product7,10
>>> > 96,basket3,product4,20
>>> > 97,basket3,product3,10
>>> > 98,basket3,product1,10
>>> > 99,basket4,product4,40
>>> > 110,basket4,product3,10
>>> > 111,basket4,product1,10'
>>> > After this I committed and made sure the data got published. to solr
>>> >
>>> > curl --data-urlencode
>>> > 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
>>> > http://localhost:8983/solr/knr/stream
>>> >
>>> > {
>>> >
>>> >   "result-set":{
>>> >
>>> > "docs":[{
>>> >
>>> > "EOF":true,
>>> >
>>> > "RESPONSE_TIME":4}]}}
>>> >
>>> >
>>> > and if I add *scatter="branches, leaves" , there is one doc.*
>>> >
>>> >
>>> >
>>> > curl --data-urlencode
>>> >
>>> >
>>> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
>>> > leaves")' http://localhost:8983/solr/knr/stream
>>> >
>>> > {
>>> >
>>> >   "result-set":{
>>> >
>>> > "docs":[{
>>> >
>>> > "node":"product1",
>>> >
>>> > "collection":"knr",
>>> >
>>> > "field":"node",
>>> >
>>> > "level":0}
>>> >
>>> >   ,{
>>> >
>>> > "EOF":true,
>>> >
>>> > "RESPONSE_TIME":4}]}}
>>> >
>>> >
>>> >
>>> >
>>> > Below is the data I got from
>>> >
>>> >
>>> https://github.com/apache/lucene-solr/blob/branch_7_6/solr/solrj/src/test/org/apache/solr/client/solrj/io/graph/GraphExpressionTest.java#L271
>>> >
>>> >
>>> >
>>> > According to this test 4 docs are expected.
>>> >
>>> >
>>> > I am not sure what I am missing. Any pointers, please
>>> >
>>> >
>>> > Thanks you,
>>> >
>>> > Rajeswari
>>> >

Re: Help with Stream Graph

2019-10-17 Thread Rajeswari Natarajan
I tried the query below and it returns 0 results

http://localhost:8983/solr/knr/export?{!terms+f%3Dproduct_s}product1=false=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2


{
  "responseHeader":{"status":0},
  "response":{
"numFound":0,
"docs":[]}}

Regards,
Rajeswari
On Thu, Oct 17, 2019 at 8:05 AM Rajeswari Natarajan 
wrote:

> Thanks Joel.
>
> Here is the logs for below request
>
> curl --data-urlencode
> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
> http://localhost:8983/solr/knr/stream
>
> 2019-10-17 15:02:06.969 INFO  (qtp952486988-280) [c:knr s:shard1
> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> [knr_shard1_replica_n1]  webapp=/solr path=/stream
> params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s")}
> status=0 QTime=0
>
> 2019-10-17 15:02:06.975 INFO  (qtp952486988-192) [c:knr s:shard1
> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> [knr_shard1_replica_n1]  webapp=/solr path=/export
> params={q={!terms+f%3Dproduct_s}product1=false=off=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2}
> hits=0 status=0 QTime=1
>
>
>
> Here is the logs for
>
>
>
> curl --data-urlencode
> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
> leaves")' http://localhost:8983/solr/knr/stream
>
>
> 2019-10-17 15:03:57.068 INFO  (qtp952486988-356) [c:knr s:shard1
> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> [knr_shard1_replica_n1]  webapp=/solr path=/stream
> params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s",scatter%3D"branches,+leaves")}
> status=0 QTime=0
>
> 2019-10-17 15:03:57.071 INFO  (qtp952486988-400) [c:knr s:shard1
> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> [knr_shard1_replica_n1]  webapp=/solr path=/export
> params={q={!terms+f%3Dproduct_s}product1=false=off=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2}
> hits=0 status=0 QTime=0
>
>
>
>
> Thank you,
>
> Rajeswari
>
> On Thu, Oct 17, 2019 at 5:23 AM Joel Bernstein  wrote:
>
>> Can you show the logs from this request. There will be a Solr query that
>> gets sent with product1 searched against the product_s field. Let's see
>> how
>> many documents that query returns.
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>>
>> On Thu, Oct 17, 2019 at 1:41 AM Rajeswari Natarajan 
>> wrote:
>>
>> > Hi,
>> >
>> > Since the stream graph query for my use case , didn't work as  i took
>> the
>> > data from solr source code test and also copied the schema and
>> > solrconfig.xml from solr 7.6 source code.  Had to substitute few
>> variables.
>> >
>> > Posted below data
>> >
>> > curl -X POST http://localhost:8983/solr/knr/update -H
>> > 'Content-type:text/csv' -d '
>> > id, basket_s, product_s, prics_f
>> > 90,basket1,product1,20
>> > 91,basket1,product3,30
>> > 92,basket1,product5,1
>> > 93,basket2,product1,2
>> > 94,basket2,product6,5
>> > 95,basket2,product7,10
>> > 96,basket3,product4,20
>> > 97,basket3,product3,10
>> > 98,basket3,product1,10
>> > 99,basket4,product4,40
>> > 110,basket4,product3,10
>> > 111,basket4,product1,10'
>> > After this I committed and made sure the data got published. to solr
>> >
>> > curl --data-urlencode
>> > 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
>> > http://localhost:8983/solr/knr/stream
>> >
>> > {
>> >
>> >   "result-set":{
>> >
>> > "docs":[{
>> >
>> > "EOF":true,
>> >
>> > "RESPONSE_TIME":4}]}}
>> >
>> >
>> > and if I add *scatter="branches, leaves" , there is one doc.*
>> >
>> >
>> >
>> > curl --data-urlencode
>> >
>> >
>> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
>> > leaves")' http://localhost:8983/solr/knr/stream
>> >
>> > {
>> >
>> >   "result-set":{
>> >
>> > "docs":[{
>> >
>> > "node":"product1",
>> >
>> > "collection":"knr",
>> >
>> > "field":"node",
>> >
>> > "level":0}
>> >
>> >   ,{
>> >
>> > "EOF":true,
>> >
>> > "RESPONSE_TIME":4}]}}
>> >
>> >
>> >
>> >
>> > Below is the data I got from
>> >
>> >
>> https://github.com/apache/lucene-solr/blob/branch_7_6/solr/solrj/src/test/org/apache/solr/client/solrj/io/graph/GraphExpressionTest.java#L271
>> >
>> >
>> >
>> > According to this test 4 docs are expected.
>> >
>> >
>> > I am not sure what I am missing. Any pointers, please
>> >
>> >
>> > Thanks you,
>> >
>> > Rajeswari
>> >
>>
>


Re: Help with Stream Graph

2019-10-17 Thread Rajeswari Natarajan
Thanks Joel.

Here are the logs for the request below:

curl --data-urlencode
'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
http://localhost:8983/solr/knr/stream

2019-10-17 15:02:06.969 INFO  (qtp952486988-280) [c:knr s:shard1
r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
[knr_shard1_replica_n1]  webapp=/solr path=/stream
params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s")}
status=0 QTime=0

2019-10-17 15:02:06.975 INFO  (qtp952486988-192) [c:knr s:shard1
r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
[knr_shard1_replica_n1]  webapp=/solr path=/export
params={q={!terms+f%3Dproduct_s}product1=false=off=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2}
hits=0 status=0 QTime=1



Here are the logs for:



curl --data-urlencode
'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
leaves")' http://localhost:8983/solr/knr/stream


2019-10-17 15:03:57.068 INFO  (qtp952486988-356) [c:knr s:shard1
r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
[knr_shard1_replica_n1]  webapp=/solr path=/stream
params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s",scatter%3D"branches,+leaves")}
status=0 QTime=0

2019-10-17 15:03:57.071 INFO  (qtp952486988-400) [c:knr s:shard1
r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
[knr_shard1_replica_n1]  webapp=/solr path=/export
params={q={!terms+f%3Dproduct_s}product1=false=off=basket_s,product_s=basket_s+asc,product_s+asc=json=2.2}
hits=0 status=0 QTime=0




Thank you,

Rajeswari

On Thu, Oct 17, 2019 at 5:23 AM Joel Bernstein  wrote:

> Can you show the logs from this request. There will be a Solr query that
> gets sent with product1 searched against the product_s field. Let's see how
> many documents that query returns.
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Thu, Oct 17, 2019 at 1:41 AM Rajeswari Natarajan 
> wrote:
>
> > Hi,
> >
> > Since the stream graph query for my use case , didn't work as  i took the
> > data from solr source code test and also copied the schema and
> > solrconfig.xml from solr 7.6 source code.  Had to substitute few
> variables.
> >
> > Posted below data
> >
> > curl -X POST http://localhost:8983/solr/knr/update -H
> > 'Content-type:text/csv' -d '
> > id, basket_s, product_s, prics_f
> > 90,basket1,product1,20
> > 91,basket1,product3,30
> > 92,basket1,product5,1
> > 93,basket2,product1,2
> > 94,basket2,product6,5
> > 95,basket2,product7,10
> > 96,basket3,product4,20
> > 97,basket3,product3,10
> > 98,basket3,product1,10
> > 99,basket4,product4,40
> > 110,basket4,product3,10
> > 111,basket4,product1,10'
> > After this I committed and made sure the data got published. to solr
> >
> > curl --data-urlencode
> > 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
> > http://localhost:8983/solr/knr/stream
> >
> > {
> >
> >   "result-set":{
> >
> > "docs":[{
> >
> > "EOF":true,
> >
> > "RESPONSE_TIME":4}]}}
> >
> >
> > and if I add *scatter="branches, leaves" , there is one doc.*
> >
> >
> >
> > curl --data-urlencode
> >
> >
> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
> > leaves")' http://localhost:8983/solr/knr/stream
> >
> > {
> >
> >   "result-set":{
> >
> > "docs":[{
> >
> > "node":"product1",
> >
> > "collection":"knr",
> >
> > "field":"node",
> >
> > "level":0}
> >
> >   ,{
> >
> > "EOF":true,
> >
> > "RESPONSE_TIME":4}]}}
> >
> >
> >
> >
> > Below is the data I got from
> >
> >
> https://github.com/apache/lucene-solr/blob/branch_7_6/solr/solrj/src/test/org/apache/solr/client/solrj/io/graph/GraphExpressionTest.java#L271
> >
> >
> >
> > According to this test 4 docs are expected.
> >
> >
> > I am not sure what I am missing. Any pointers, please
> >
> >
> > Thanks you,
> >
> > Rajeswari
> >
>


Re: Help with Stream Graph

2019-10-17 Thread Joel Bernstein
Can you show the logs from this request? There will be a Solr query that
gets sent with product1 searched against the product_s field. Let's see how
many documents that query returns.


Joel Bernstein
http://joelsolr.blogspot.com/


On Thu, Oct 17, 2019 at 1:41 AM Rajeswari Natarajan 
wrote:

> Hi,
>
> Since the stream graph query for my use case , didn't work as  i took the
> data from solr source code test and also copied the schema and
> solrconfig.xml from solr 7.6 source code.  Had to substitute few variables.
>
> Posted below data
>
> curl -X POST http://localhost:8983/solr/knr/update -H
> 'Content-type:text/csv' -d '
> id, basket_s, product_s, prics_f
> 90,basket1,product1,20
> 91,basket1,product3,30
> 92,basket1,product5,1
> 93,basket2,product1,2
> 94,basket2,product6,5
> 95,basket2,product7,10
> 96,basket3,product4,20
> 97,basket3,product3,10
> 98,basket3,product1,10
> 99,basket4,product4,40
> 110,basket4,product3,10
> 111,basket4,product1,10'
> After this I committed and made sure the data got published. to solr
>
> curl --data-urlencode
> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
> http://localhost:8983/solr/knr/stream
>
> {
>
>   "result-set":{
>
> "docs":[{
>
> "EOF":true,
>
> "RESPONSE_TIME":4}]}}
>
>
> and if I add *scatter="branches, leaves" , there is one doc.*
>
>
>
> curl --data-urlencode
>
> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
> leaves")' http://localhost:8983/solr/knr/stream
>
> {
>
>   "result-set":{
>
> "docs":[{
>
> "node":"product1",
>
> "collection":"knr",
>
> "field":"node",
>
> "level":0}
>
>   ,{
>
> "EOF":true,
>
> "RESPONSE_TIME":4}]}}
>
>
>
>
> Below is the data I got from
>
> https://github.com/apache/lucene-solr/blob/branch_7_6/solr/solrj/src/test/org/apache/solr/client/solrj/io/graph/GraphExpressionTest.java#L271
>
>
>
> According to this test 4 docs are expected.
>
>
> I am not sure what I am missing. Any pointers, please
>
>
> Thanks you,
>
> Rajeswari
>


Help with Stream Graph

2019-10-16 Thread Rajeswari Natarajan
Hi,

Since the stream graph query for my use case didn't work, I took the
data from a Solr source code test and also copied the schema and
solrconfig.xml from the Solr 7.6 source code. I had to substitute a few variables.

Posted below data

curl -X POST http://localhost:8983/solr/knr/update -H
'Content-type:text/csv' -d '
id, basket_s, product_s, prics_f
90,basket1,product1,20
91,basket1,product3,30
92,basket1,product5,1
93,basket2,product1,2
94,basket2,product6,5
95,basket2,product7,10
96,basket3,product4,20
97,basket3,product3,10
98,basket3,product1,10
99,basket4,product4,40
110,basket4,product3,10
111,basket4,product1,10'
After this I committed and made sure the data got published to Solr.
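
(As a sketch, an explicit commit can be issued against the standard update
handler like this:)

curl "http://localhost:8983/solr/knr/update?commit=true"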

curl --data-urlencode
'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
http://localhost:8983/solr/knr/stream

{

  "result-set":{

"docs":[{

"EOF":true,

"RESPONSE_TIME":4}]}}


and if I add *scatter="branches, leaves" , there is one doc.*



curl --data-urlencode
'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
leaves")' http://localhost:8983/solr/knr/stream

{

  "result-set":{

"docs":[{

"node":"product1",

"collection":"knr",

"field":"node",

"level":0}

  ,{

"EOF":true,

"RESPONSE_TIME":4}]}}




Below is the data I got from
https://github.com/apache/lucene-solr/blob/branch_7_6/solr/solrj/src/test/org/apache/solr/client/solrj/io/graph/GraphExpressionTest.java#L271



According to this test 4 docs are expected.


I am not sure what I am missing. Any pointers, please?


Thank you,

Rajeswari


Re: Need help with Solr Streaming query

2019-10-16 Thread Erick Erickson
The NOT operator isn’t a Boolean NOT, so it requires some care; Chris Hostetter
wrote a good blog about that. Try

q=*:* -(:*c*

The query q=-something really isn’t valid syntax, but some query parsers help
you out by silently putting the *:* in front of it. That’s not guaranteed
across all parsers, though.
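
Applied to the streaming expression in this thread, that looks roughly like the
sketch below; the collection and field names (myCollection, title_s, id) are
placeholders, since the originals were masked out:

top(n=105,
    search(myCollection, qt="/export",
           q="*:* -(title_s:*c*)",
           fl="id,title_s",
           sort="id asc"),
    sort="id asc")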

Best,
Erick

> On Oct 16, 2019, at 8:14 AM, Prasenjit Sarkar  
> wrote:
> 
> Hi,
> 
> 
> I am facing issue while working with solr streamimg expression. I am using 
> /export for emiting tuples out of streaming query.Howver when I tried to use 
> not operator in solr query it is not working.The same is working with /select.
> 
> 
> Please find the below query 
> 
> 
> top(n=105,search(,qt="/export",q="-(: *c*) 
> ",fl="",sort=" asc"),sort=" asc")
> 
> 
> 
> In the above query the not operator q="-(: *c*)" is not 
> working with /export.However the same query works when I combine any postive 
> search criteria with the not expression like q="-(: *c*) AND 
> (: **)". Can you please help here. As running only not query 
> with /export should be a valid use case. I have also checked the solr logs 
> and found no errors when running the not query.The query is just not 
> returning any value and it is returning with no result very fast.
> 
> 
> 
> 
> 
> Regards,
> Prasenjit Sarkar
> 
> 
> Experience certainty. IT Services
> Business Solutions
> Consulting
> 
> =-=-=
> Notice: The information contained in this e-mail
> message and/or attachments to it may contain 
> confidential or privileged information. If you are 
> not the intended recipient, any dissemination, use, 
> review, distribution, printing or copying of the 
> information contained in this e-mail message 
> and/or attachments to it are strictly prohibited. If 
> you have received this communication in error, 
> please notify us by reply e-mail or telephone and 
> immediately and permanently delete the message 
> and any attachments. Thank you
> 
> 



Need help with Solr Streaming query

2019-10-16 Thread Prasenjit Sarkar
Hi,


I am facing an issue while working with a Solr streaming expression. I am using
/export for emitting tuples out of the streaming query. However, when I tried to
use the NOT operator in the Solr query it is not working. The same is working
with /select.


Please find the below query 


top(n=105,search(,qt="/export",q="-(: *c*) 
",fl="",sort=" asc"),sort=" asc")



In the above query the NOT operator q="-(: *c*)" is not
working with /export. However, the same query works when I combine any positive
search criterion with the NOT expression, like q="-(: *c*) AND
(: **)". Can you please help here? Running only a NOT query
with /export should be a valid use case. I have also checked the Solr logs and
found no errors when running the NOT query. The query just does not return any
value, and it returns with no results very fast.





Regards,
Prasenjit Sarkar


Experience certainty. IT Services
Business Solutions
Consulting

=-=-=
Notice: The information contained in this e-mail
message and/or attachments to it may contain 
confidential or privileged information. If you are 
not the intended recipient, any dissemination, use, 
review, distribution, printing or copying of the 
information contained in this e-mail message 
and/or attachments to it are strictly prohibited. If 
you have received this communication in error, 
please notify us by reply e-mail or telephone and 
immediately and permanently delete the message 
and any attachments. Thank you




Re: Re: Need urgent help with Solr spatial search using SpatialRecursivePrefixTreeFieldType

2019-10-01 Thread David Smiley
Do you know how URLs are structured?  They include name=value pairs
separated by ampersands.  This takes precedence over the contents of any
particular name or value.  Consequently, looking at your parentheses doesn't
make sense, since the open and close span ampersands and thus go to different
filter queries.  I think you can completely remove those parentheses, in
fact.  Also try a tool like Postman to compose your queries rather than
direct URL manipulation.

&sfield=adminLatLon
&d=80
&fq= {!geofilt pt=33.0198431,-96.6988856} OR {!geofilt
pt=50.2171726,8.265894}

Notice the leading space after 'fq'.  This is a syntax parsing gotcha that
has to do with how embedded queries are parsed, which is what you need here
since you need to compose two of them with an operator.  It'd be kinda awkward to
fix that gotcha in Solr.  There are other techniques too, but this is the
most succinct.
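
Assembled into one request this might look like the following sketch (the host
is a placeholder; the collection and handler names are the ones used earlier in
the thread, and the leading space after fq= is deliberate):

curl -G "http://localhost:8983/solr/ac3_persons/admin_directory_search_geolocation" \
  --data-urlencode 'q=david' \
  --data-urlencode 'sfield=adminLatLon' \
  --data-urlencode 'd=80' \
  --data-urlencode 'fq= {!geofilt pt=33.0198431,-96.6988856} OR {!geofilt pt=50.2171726,8.265894}'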

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Oct 1, 2019 at 7:34 AM anushka gupta <
anushka_gu...@external.mckinsey.com> wrote:

> Thanks,
>
> Could you please help me in combining two geofilt fqs as the following
> gives
> error, it treats ")" as part of the d parameter and gives error that
> 'd=80)'
> is not a valid param:
>
>
> ({!geofilt}=adminLatLon=33.0198431,-96.6988856=80)+OR+({!geofilt}=adminLatLon=50.2171726,8.265894=80)
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Re: Need urgent help with Solr spatial search using SpatialRecursivePrefixTreeFieldType

2019-10-01 Thread anushka gupta
Thanks, 

Could you please help me combine two geofilt fqs? The following gives an
error; it treats ")" as part of the d parameter and complains that 'd=80)'
is not a valid param:

({!geofilt}=adminLatLon=33.0198431,-96.6988856=80)+OR+({!geofilt}=adminLatLon=50.2171726,8.265894=80)



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Re: Need urgent help with Solr spatial search using SpatialRecursivePrefixTreeFieldType

2019-09-30 Thread David Smiley
"sort" is a regular request parameter.  In your non-working query, you
specified it as a local-param inside geofilt which isn't where it belongs.
If you want to sort from two points then you need to make up your mind on
how to combine the distances into some greater aggregate function (e.g.
min/max/sum).
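
As a sketch, combining the two distances with min() against the field and
points from this thread (this assumes geodist's optional (sfield,lat,lon)
argument form):

sort=min(geodist(adminLatLon,33.0198431,-96.6988856),geodist(adminLatLon,50.2171726,8.265894)) asc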

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Sep 30, 2019 at 10:22 AM Anushka Gupta <
anushka_gu...@external.mckinsey.com> wrote:

> Hi,
>
>
>
> I want to be able to filter on different cities and also sort the results
> based on geoproximity. But sorting doesn’t work:
>
>
>
>
> admin_directory_search_geolocation?q=david=({!geofilt+sfield=adminLatLon+pt=33.0198431,-96.6988856+d=80+sort=min(geodist(33.0198431,-96.6988856))})+OR+({!geofilt+sfield=adminLatLon+pt=50.2171726,8.265894+d=80+sort=min(geodist(50.2171726,8.265894))})
>
>
>
> Sorting works fine if I add ‘&’ in geofilt condition like :
> q=david={!geofilt=adminLatLon=33.0198431,-96.6988856=80=geodist(33.0198431,-96.6988856)}
>
>
>
> But when I combine the two FQs then sorting doesn’t work.
>
>
>
> Please help.
>
>
>
>
>
> Best regards,
>
> Anushka gupta
>
>
>
>
>
>
>
> *From:* David Smiley 
> *Sent:* Friday, September 13, 2019 10:29 PM
> *To:* Anushka Gupta 
> *Subject:* [EXT]Re: Need urgent help with Solr spatial search using
> SpatialRecursivePrefixTreeFieldType
>
>
>
> Hello,
>
>
>
> Please don't email me directly for public help.  CC is okay if you send it
> to solr-user@lucene.apache.org so that the Solr community can benefit
> from my answer or might even answer it.
>
>
> ~ David Smiley
>
> Apache Lucene/Solr Search Developer
>
> http://www.linkedin.com/in/davidwsmiley
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.linkedin.com_in_davidwsmiley=DwMFaQ=yIH1_-b1hO27QV_BdDph9suDL0Jq0WcgndLmIuQXoms=0egJOuVVdmY5VQTw_S3m4bVez1r-U8nqqi6RYBxO6tTbzryrDHrFoJROJ8r-TqNc=ulu2-5V3TDOnVNfRRQusod6-FoJcdeAWu5gGB3owryU=Hv2uYeXnut3oi1ijHp14BJ09QIZzhEI-onwzhnQYB8I=>
>
>
>
>
>
> On Wed, Sep 11, 2019 at 11:27 AM Anushka Gupta <
> anushka_gu...@external.mckinsey.com> wrote:
>
> Hello David,
>
>
>
> I read a lot of articles of yours regarding Solr spatial search using
> SpatialRecursivePrefixTreeFieldType. But unfortunately it doesn’t work for
> me when I combine filter query with my keyword search.
>
>
>
> Solr Version used : Solr 7.1.0
>
>
>
> I have declared fields as :
>
>
>
>  class="solr.SpatialRecursivePrefixTreeFieldType" geo="true"
> maxDistErr="0.001"
>
> distErrPct="0.025"
> distanceUnits="kilometers"/>
>
>
>
>  stored="true"  multiValued="true" />
>
>
>
>
>
> Field values are populated like :
>
> adminLatLon: [50.2171726,8.265894]
>
>
>
> Query is :
>
>
> /solr/ac3_persons/admin_directory_search_location?q=Idstein=Idstein={!geofilt%20cache=false%20cost=100}=adminLatLon=50.2171726,8.265894=500=recip(geodist(),2,200,20)=true
>
>
>
> My request handler is :
>
> admin_directory_search_location
>
>
>
> I get results if I do :
>
> /solr/ac3_persons/admin_directory_search_location?q=*:*
> =Idstein={!geofilt%20cache=false%20cost=100}=adminLatLon=50.2171726,8.265894=500=recip(geodist(),2,200,20)=true
>
>
>
> But I do not get results when I add any keyword in q.
>
>
>
> I am stuck in this issue since last many days. Could you please help with
> the same.
>
>
>
>
>
> Thanks,
>
> Anushka Gupta
>
>
>
> ++
> This email is confidential and may be privileged. If you have received it
> in error, please notify us immediately and then delete it. Please do not
> copy it, disclose its contents or use it for any purpose.
> ++
>
> ++
> This email is confidential and may be privileged. If you have received it
> in error, please notify us immediately and then delete it. Please do not
> copy it, disclose its contents or use it for any purpose.
> ++
>


Re: Re: Need urgent help with Solr spatial search using SpatialRecursivePrefixTreeFieldType

2019-09-30 Thread Tim Casey
https://stackoverflow.com/questions/48348312/solr-7-how-to-do-full-text-search-w-geo-spatial-search


On Mon, Sep 30, 2019 at 10:31 AM Anushka Gupta <
anushka_gu...@external.mckinsey.com> wrote:

> Hi,
>
> I want to be able to filter on different cities and also sort the results
> based on geoproximity. But sorting doesn’t work:
>
>
> admin_directory_search_geolocation?q=david=({!geofilt+sfield=adminLatLon+pt=33.0198431,-96.6988856+d=80+sort=min(geodist(33.0198431,-96.6988856))})+OR+({!geofilt+sfield=adminLatLon+pt=50.2171726,8.265894+d=80+sort=min(geodist(50.2171726,8.265894))})
>
> Sorting works fine if I add ‘&’ in geofilt condition like :
> q=david={!geofilt=adminLatLon=33.0198431,-96.6988856=80=geodist(33.0198431,-96.6988856)}
>
> But when I combine the two FQs then sorting doesn’t work.
>
> Please help.
>
>
> Best regards,
> Anushka gupta
>
>
>
> From: David Smiley 
> Sent: Friday, September 13, 2019 10:29 PM
> To: Anushka Gupta 
> Subject: [EXT]Re: Need urgent help with Solr spatial search using
> SpatialRecursivePrefixTreeFieldType
>
> Hello,
>
> Please don't email me directly for public help.  CC is okay if you send it
> to solr-user@lucene.apache.org<mailto:solr-user@lucene.apache.org> so
> that the Solr community can benefit from my answer or might even answer it.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley<
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.linkedin.com_in_davidwsmiley=DwMFaQ=yIH1_-b1hO27QV_BdDph9suDL0Jq0WcgndLmIuQXoms=0egJOuVVdmY5VQTw_S3m4bVez1r-U8nqqi6RYBxO6tTbzryrDHrFoJROJ8r-TqNc=ulu2-5V3TDOnVNfRRQusod6-FoJcdeAWu5gGB3owryU=Hv2uYeXnut3oi1ijHp14BJ09QIZzhEI-onwzhnQYB8I=
> >
>
>
> On Wed, Sep 11, 2019 at 11:27 AM Anushka Gupta <
> anushka_gu...@external.mckinsey.com anushka_gu...@external.mckinsey.com>> wrote:
> Hello David,
>
> I read a lot of articles of yours regarding Solr spatial search using
> SpatialRecursivePrefixTreeFieldType. But unfortunately it doesn’t work for
> me when I combine filter query with my keyword search.
>
> Solr Version used : Solr 7.1.0
>
> I have declared fields as :
>
>  class="solr.SpatialRecursivePrefixTreeFieldType" geo="true"
> maxDistErr="0.001"
> distErrPct="0.025"
> distanceUnits="kilometers"/>
>
>  stored="true"  multiValued="true" />
>
>
> Field values are populated like :
> adminLatLon: [50.2171726,8.265894]
>
> Query is :
>
> /solr/ac3_persons/admin_directory_search_location?q=Idstein=Idstein={!geofilt%20cache=false%20cost=100}=adminLatLon=50.2171726,8.265894=500=recip(geodist(),2,200,20)=true
>
> My request handler is :
> admin_directory_search_location
>
> I get results if I do :
>
> /solr/ac3_persons/admin_directory_search_location?q=*:*=Idstein={!geofilt%20cache=false%20cost=100}=adminLatLon=50.2171726,8.265894=500=recip(geodist(),2,200,20)=true
>
> But I do not get results when I add any keyword in q.
>
> I am stuck in this issue since last many days. Could you please help with
> the same.
>
>
> Thanks,
> Anushka Gupta
>
> ++
> This email is confidential and may be privileged. If you have received it
> in error, please notify us immediately and then delete it. Please do not
> copy it, disclose its contents or use it for any purpose.
> ++
>
> ++
> This email is confidential and may be privileged. If you have received it
> in error, please notify us immediately and then delete it.  Please do not
> copy it, disclose its contents or use it for any purpose.
> ++
>


RE: Re: Need urgent help with Solr spatial search using SpatialRecursivePrefixTreeFieldType

2019-09-30 Thread Anushka Gupta
Hi,

I want to be able to filter on different cities and also sort the results based 
on geoproximity. But sorting doesn’t work:

admin_directory_search_geolocation?q=david=({!geofilt+sfield=adminLatLon+pt=33.0198431,-96.6988856+d=80+sort=min(geodist(33.0198431,-96.6988856))})+OR+({!geofilt+sfield=adminLatLon+pt=50.2171726,8.265894+d=80+sort=min(geodist(50.2171726,8.265894))})

Sorting works fine if I add ‘&’ in geofilt condition like : 
q=david={!geofilt=adminLatLon=33.0198431,-96.6988856=80=geodist(33.0198431,-96.6988856)}

But when I combine the two FQs then sorting doesn’t work.

Please help.


Best regards,
Anushka gupta



From: David Smiley 
Sent: Friday, September 13, 2019 10:29 PM
To: Anushka Gupta 
Subject: [EXT]Re: Need urgent help with Solr spatial search using 
SpatialRecursivePrefixTreeFieldType

Hello,

Please don't email me directly for public help.  CC is okay if you send it to 
solr-user@lucene.apache.org<mailto:solr-user@lucene.apache.org> so that the 
Solr community can benefit from my answer or might even answer it.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley<https://urldefense.proofpoint.com/v2/url?u=http-3A__www.linkedin.com_in_davidwsmiley=DwMFaQ=yIH1_-b1hO27QV_BdDph9suDL0Jq0WcgndLmIuQXoms=0egJOuVVdmY5VQTw_S3m4bVez1r-U8nqqi6RYBxO6tTbzryrDHrFoJROJ8r-TqNc=ulu2-5V3TDOnVNfRRQusod6-FoJcdeAWu5gGB3owryU=Hv2uYeXnut3oi1ijHp14BJ09QIZzhEI-onwzhnQYB8I=>


On Wed, Sep 11, 2019 at 11:27 AM Anushka Gupta 
mailto:anushka_gu...@external.mckinsey.com>>
 wrote:
Hello David,

I read a lot of articles of yours regarding Solr spatial search using 
SpatialRecursivePrefixTreeFieldType. But unfortunately it doesn’t work for me 
when I combine filter query with my keyword search.

Solr Version used : Solr 7.1.0

I have declared fields as :






Field values are populated like :
adminLatLon: [50.2171726,8.265894]

Query is :
/solr/ac3_persons/admin_directory_search_location?q=Idstein=Idstein={!geofilt%20cache=false%20cost=100}=adminLatLon=50.2171726,8.265894=500=recip(geodist(),2,200,20)=true

My request handler is :
admin_directory_search_location

I get results if I do :
/solr/ac3_persons/admin_directory_search_location?q=*:*=Idstein={!geofilt%20cache=false%20cost=100}=adminLatLon=50.2171726,8.265894=500=recip(geodist(),2,200,20)=true

But I do not get results when I add any keyword in q.

I am stuck in this issue since last many days. Could you please help with the 
same.


Thanks,
Anushka Gupta

++
This email is confidential and may be privileged. If you have received it
in error, please notify us immediately and then delete it. Please do not
copy it, disclose its contents or use it for any purpose.
++

++
This email is confidential and may be privileged. If you have received it
in error, please notify us immediately and then delete it.  Please do not
copy it, disclose its contents or use it for any purpose.
++


[JOB] remote job at Help Scout

2019-09-26 Thread Leah Knobler
Hey all!

Help Scout
<https://t.yesware.com/tt/5c3e6e41ed343ce324b9d063a065f0f54b1dbc48/a7b6c6ce29fdcfd3033c830348fdf63e/5ef7f23b853bcf4180b332337995eaa9/www.helpscout.com/>,
a 100 person remote company that builds helpful customer messaging tools,
is looking for a Java Data Engineer
<https://t.yesware.com/tt/5c3e6e41ed343ce324b9d063a065f0f54b1dbc48/a7b6c6ce29fdcfd3033c830348fdf63e/ccb92296e4c83fc160e0ee4dc8aeb4c6/jobs.lever.co/helpscout/87acb031-e8f1-4c21-9f4a-2488aab1a6c9>to
join our Search team. We are looking to hire someone who relishes designing
and building systems and services that can manage large data sets with a
high transaction volume that are scaling constantly to meet customer
demand. The ideal person takes pride in building coherent and usable
interfaces making it easy to use and operate on data. This role would allow
you to take on challenging problems, choose the right tools for the job and
build elegant, scalable solutions.

If this sounds like a fit, please apply
<https://t.yesware.com/tt/5c3e6e41ed343ce324b9d063a065f0f54b1dbc48/a7b6c6ce29fdcfd3033c830348fdf63e/43ce129441bc2214326e83c493fa21ca/jobs.lever.co/helpscout/87acb031-e8f1-4c21-9f4a-2488aab1a6c9>
and feel free to reach out to me. Thanks!

Leah


Re: Can you help with this JOIN and OR query?

2019-09-11 Thread Mikhail Khludnev
Hello, James.
Right. The syntax is cumbersome:

q=articledate:[2018-09-04T00:00:00Z TO 2019-09-10T23:59:59Z] {!join to=id
from=url v=$param}&param=articledate:[2018-09-04T12:00:00Z TO
2019-09-10T11:59:59Z])
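
Sent as separate, properly encoded request parameters (a sketch; the host and
collection are placeholders, and the {!join} block picks up the second clause
through v=$param):

curl -G "http://localhost:8983/solr/<collection>/select" \
  --data-urlencode 'q=articledate:[2018-09-04T00:00:00Z TO 2019-09-10T23:59:59Z] {!join to=id from=url v=$param}' \
  --data-urlencode 'param=articledate:[2018-09-04T12:00:00Z TO 2019-09-10T11:59:59Z]'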

On Wed, Sep 11, 2019 at 9:39 AM Smith2, James
 wrote:

> Hi there,
>
> I was hoping that you may be able to assist us with a search issue we're
> facing.
>
> Each one of these queries work on their own:
> articledate:[2018-09-04T00:00:00Z TO 2019-09-10T23:59:59Z]
>
> {!join to=id from=url}articledate:[2018-09-04T12:00:00Z TO
> 2019-09-10T11:59:59Z])
>
> But if we try and combine them in to an or statement:
>
> (articledate:[2018-09-04T00:00:00Z TO 2019-09-10T23:59:59Z] OR {!join
> to=id from=url}articledate:[2018-09-04T12:00:00Z TO 2019-09-10T11:59:59Z])
>
> Then we get a parser error.  The odd thing is that in the second clause,
> if we search for single date then it works, so we were thinking that we had
> the scope of the parser correct.
>
> If you could provide some guidance it'd be most appreciated.,
>
> Kind regards,
>
> James Smith
>
>
>
> IMPORTANT NOTICE:  This email and any attachments may contain information
> that is confidential and privileged. It is intended to be received only by
> persons entitled to receive the information. If you are not the intended
> recipient, please delete it from your system and notify the sender. You
> should not copy it or use it for any purpose nor disclose or distribute its
> contents to any other person.
>
>

-- 
Sincerely yours
Mikhail Khludnev


Can you help with this JOIN and OR query?

2019-09-11 Thread Smith2, James
Hi there,

I was hoping that you may be able to assist us with a search issue we're facing.

Each one of these queries works on its own:
articledate:[2018-09-04T00:00:00Z TO 2019-09-10T23:59:59Z]

{!join to=id from=url}articledate:[2018-09-04T12:00:00Z TO 
2019-09-10T11:59:59Z])

But if we try and combine them in to an or statement:

(articledate:[2018-09-04T00:00:00Z TO 2019-09-10T23:59:59Z] OR {!join to=id 
from=url}articledate:[2018-09-04T12:00:00Z TO 2019-09-10T11:59:59Z])

Then we get a parser error. The odd thing is that in the second clause, if we
search for a single date then it works, so we were thinking that we had the scope
of the parser correct.

If you could provide some guidance it'd be most appreciated.

Kind regards,

James Smith



IMPORTANT NOTICE:  This email and any attachments may contain information that 
is confidential and privileged. It is intended to be received only by persons 
entitled to receive the information. If you are not the intended recipient, 
please delete it from your system and notify the sender. You should not copy it 
or use it for any purpose nor disclose or distribute its contents to any other 
person.



Need help | NoNodeException | Could not read DIH properties

2019-09-05 Thread Pal Sumit
Hi,

I am getting the below log very frequently and I can't find more details 
about it.

ZKPropertiesWriter  Could not read DIH properties from 
/configs//dataimport.properties :class 
org.apache.zookeeper.KeeperException$NoNodeException

Details:

We have a Solr cluster containing 2 Solr nodes and 2 replicas. There are 2 
servers where the nodes and replicas are deployed. So,

server1 has node1 & node2_replica
server2 has node2 & node1_replica.

In another server we have deployed a single instance of Zookeeper (I would 
consider it make a Zookeeper ensemble). 
All the servers are in AWS auto-scaling so if any one is down another one 
pops up and the Solr/Zookeeper gets installed in there.

We are creating Solr cores using the below API:
http:///
admin/collections?action=CREATE==json

We are feeding data into the Solr  cluster using a batch job using the 
POST API  http:///solr/collection/update?commit=true=json.
We do not use Data Import Handler to import data. But still we are getting 
the exception. 

In one post  I have read that partial import using DIH can cause the 
problem but I am not using DIH here.

While querying data using the below API, no results are returned (previously
it was returning data).
https://SolrHostname/solr//select?q=(Composition)=edismax=title^2
 
author impn source=100=json

It would be a great help if someone could provide me a clue. Please let me know
if more details are needed to analyze this.


Thanks & Regards
Sumit Pal
=-=-=
Notice: The information contained in this e-mail
message and/or attachments to it may contain 
confidential or privileged information. If you are 
not the intended recipient, any dissemination, use, 
review, distribution, printing or copying of the 
information contained in this e-mail message 
and/or attachments to it are strictly prohibited. If 
you have received this communication in error, 
please notify us by reply e-mail or telephone and 
immediately and permanently delete the message 
and any attachments. Thank you




Re: Basic Query Not Working - Please Help

2019-07-30 Thread Furkan KAMACI
Hi Vipul,

You are welcome!

Kind Regards,
Furkan KAMACI

On Fri, Jul 26, 2019 at 11:07 AM Vipul Bahuguna <
newthings4learn...@gmail.com> wrote:

> Hi Furkan -
>
> I realized that I was searching incorrectly.
> I later realized that if I need to search by specific field, I need to do
> as you suggested -
> q=appname:App1 .
>
> OR if need to simply search by App1, then I need to use  to
> index my field appname at the time of insertion so that it can be later
> search without specifying the fieldname.
>
> thanks for your response.
>
> On Tue, Jul 23, 2019 at 6:07 AM Furkan KAMACI 
> wrote:
>
> > Hi Vipul,
> >
> > Which query do you submit? Is that one:
> >
> > q=appname:App1
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > On Mon, Jul 22, 2019 at 10:52 AM Vipul Bahuguna <
> > newthings4learn...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I have installed SOLR 8.1.1.
> > > I am new and trying the very basics.
> > >
> > > I installed solr8.1.1 on Windows and I am using SOLR in standalone
> mode.
> > >
> > > Steps I followed -
> > >
> > > 1. created a core as follows:
> > > solr create_core -c dox
> > >
> > > 2. updated the managed_schema.xml file to add few specific fields
> > specific
> > > to my schema as belows:
> > >
> > >  stored="true"/>
> > >  stored="true"/>
> > >  > stored="true"/>
> > >  > > stored="true"/>
> > >
> > > 3. then i restarted SOLR
> > >
> > > 4. then i went to the Documents tab to enter my sample data for
> indexing,
> > > which looks like below:
> > > {
> > >
> > >   "id" : "1",
> > >   "prjname" : "Project1",
> > >   "apps" : [
> > > {
> > >   "appname" : "App1",
> > >   "topics" : [
> > > {
> > >   "topicname" : "topic1",
> > >   "links" : [
> > > "http://www.google.com;,
> > > "http://www.t6.com;
> > >   ]
> > > },
> > > {
> > >   "topicname" : "topic2",
> > >   "links" : [
> > > "http://www.java.com;,
> > > "http://www.rediff.com;
> > >   ]
> > > }
> > >   ]
> > > },
> > > {
> > >   "appname" : "App2",
> > >   "topics" : [
> > > {
> > >   "topicname" : "topic3",
> > >   "links" : [
> > > "http://www.t3.com;,
> > > "http://www.t4.com;
> > >   ]
> > > },
> > > {
> > >   "topicname" : "topic4",
> > >   "links" : [
> > > "http://www.rules.com;,
> > > "http://www.amazon.com;
> > >   ]
> > > }
> > >   ]
> > > }
> > >   ]
> > > }
> > >
> > > 5. Now when i go to Query tab and click Execute Search with *.*, it
> shows
> > > my recently added document as follows:
> > > {
> > > "responseHeader":{ "status":0, "QTime":0, "params":{ "q":"*:*", "_":
> > > "1563780352100"}}, "response":{"numFound":1,"start":0,"docs":[ {
> > "id":"1",
> > > "
> > > prjname":["Project1"], "apps":["{appname=App1,
> topics=[{topicname=topic1,
> > > links=[http://www.google.com, http://www.t6.com]}, {topicname=topic2,
> > > links=[http://www.java.com, http://www.rediff.com]}]};,
> "{appname=App2,
> > > topics=[{topicname=topic3, links=[http://www.t3.com, http://www.t4.com
> > ]},
> > > {topicname=topic4, links=[http://www.rules.com, http://www.amazon.com
> > > ]}]}"],
> > > "_version_":1639742305772503040}] }}
> > >
> > > 6. But now when I am trying to search based on field topicname or
> > prjname,
> > > it does not returns any document. Even if put anything in q like App1,
> > zero
> > > results are being returned.
> > >
> > >
> > > Can someone help me understanding what I might have done incorrectly?
> > > May be I defined my schema incorrectly.
> > >
> > > Thanks in advance
> > >
> >
>


Re: Basic Query Not Working - Please Help

2019-07-26 Thread Vipul Bahuguna
Hi Furkan -

I realized that I was searching incorrectly.
I later realized that if I need to search by specific field, I need to do
as you suggested -
q=appname:App1 .

OR if I need to simply search by App1, then I need to use a copyField to
index my field appname into a default search field at the time of insertion so that
it can later be searched without specifying the fieldname.
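
A small SolrJ sketch of the query-time side of this, assuming the core is named
"dox" as in the thread; it shows a field-qualified query and, as a query-time
alternative to an index-time copyField, pointing the default field at appname
with the df parameter:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class FieldSearchExample {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/dox").build()) {
            // Option 1: qualify the field directly in q.
            SolrQuery byField = new SolrQuery("appname:App1");
            System.out.println(solr.query(byField).getResults().getNumFound());

            // Option 2: leave q unqualified and tell Solr which default field to search.
            SolrQuery byDefault = new SolrQuery("App1");
            byDefault.set("df", "appname");
            System.out.println(solr.query(byDefault).getResults().getNumFound());
        }
    }
}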

thanks for your response.

On Tue, Jul 23, 2019 at 6:07 AM Furkan KAMACI 
wrote:

> Hi Vipul,
>
> Which query do you submit? Is that one:
>
> q=appname:App1
>
> Kind Regards,
> Furkan KAMACI
>
> On Mon, Jul 22, 2019 at 10:52 AM Vipul Bahuguna <
> newthings4learn...@gmail.com> wrote:
>
> > Hi,
> >
> > I have installed SOLR 8.1.1.
> > I am new and trying the very basics.
> >
> > I installed solr8.1.1 on Windows and I am using SOLR in standalone mode.
> >
> > Steps I followed -
> >
> > 1. created a core as follows:
> > solr create_core -c dox
> >
> > 2. updated the managed_schema.xml file to add few specific fields
> specific
> > to my schema as belows:
> >
> > 
> > 
> >  stored="true"/>
> >  > stored="true"/>
> >
> > 3. then i restarted SOLR
> >
> > 4. then i went to the Documents tab to enter my sample data for indexing,
> > which looks like below:
> > {
> >
> >   "id" : "1",
> >   "prjname" : "Project1",
> >   "apps" : [
> > {
> >   "appname" : "App1",
> >   "topics" : [
> > {
> >   "topicname" : "topic1",
> >   "links" : [
> > "http://www.google.com;,
> > "http://www.t6.com;
> >   ]
> > },
> > {
> >   "topicname" : "topic2",
> >   "links" : [
> > "http://www.java.com;,
> > "http://www.rediff.com;
> >   ]
> > }
> >   ]
> > },
> > {
> >   "appname" : "App2",
> >   "topics" : [
> > {
> >   "topicname" : "topic3",
> >   "links" : [
> > "http://www.t3.com;,
> > "http://www.t4.com;
> >   ]
> > },
> > {
> >   "topicname" : "topic4",
> >   "links" : [
> > "http://www.rules.com;,
> > "http://www.amazon.com;
> >   ]
> > }
> >   ]
> > }
> >   ]
> > }
> >
> > 5. Now when i go to Query tab and click Execute Search with *.*, it shows
> > my recently added document as follows:
> > {
> > "responseHeader":{ "status":0, "QTime":0, "params":{ "q":"*:*", "_":
> > "1563780352100"}}, "response":{"numFound":1,"start":0,"docs":[ {
> "id":"1",
> > "
> > prjname":["Project1"], "apps":["{appname=App1, topics=[{topicname=topic1,
> > links=[http://www.google.com, http://www.t6.com]}, {topicname=topic2,
> > links=[http://www.java.com, http://www.rediff.com]}]};, "{appname=App2,
> > topics=[{topicname=topic3, links=[http://www.t3.com, http://www.t4.com
> ]},
> > {topicname=topic4, links=[http://www.rules.com, http://www.amazon.com
> > ]}]}"],
> > "_version_":1639742305772503040}] }}
> >
> > 6. But now when I am trying to search based on field topicname or
> prjname,
> > it does not returns any document. Even if put anything in q like App1,
> zero
> > results are being returned.
> >
> >
> > Can someone help me understanding what I might have done incorrectly?
> > May be I defined my schema incorrectly.
> >
> > Thanks in advance
> >
>


Re: Basic Query Not Working - Please Help

2019-07-22 Thread Furkan KAMACI
Hi Vipul,

Which query do you submit? Is that one:

q=appname:App1

Kind Regards,
Furkan KAMACI

On Mon, Jul 22, 2019 at 10:52 AM Vipul Bahuguna <
newthings4learn...@gmail.com> wrote:

> Hi,
>
> I have installed SOLR 8.1.1.
> I am new and trying the very basics.
>
> I installed solr8.1.1 on Windows and I am using SOLR in standalone mode.
>
> Steps I followed -
>
> 1. created a core as follows:
> solr create_core -c dox
>
> 2. updated the managed_schema.xml file to add few specific fields specific
> to my schema as belows:
>
> 
> 
> 
>  stored="true"/>
>
> 3. then i restarted SOLR
>
> 4. then i went to the Documents tab to enter my sample data for indexing,
> which looks like below:
> {
>
>   "id" : "1",
>   "prjname" : "Project1",
>   "apps" : [
> {
>   "appname" : "App1",
>   "topics" : [
> {
>   "topicname" : "topic1",
>   "links" : [
> "http://www.google.com;,
> "http://www.t6.com;
>   ]
> },
> {
>   "topicname" : "topic2",
>   "links" : [
> "http://www.java.com;,
> "http://www.rediff.com;
>   ]
> }
>   ]
> },
> {
>   "appname" : "App2",
>   "topics" : [
> {
>   "topicname" : "topic3",
>   "links" : [
> "http://www.t3.com;,
> "http://www.t4.com;
>   ]
> },
> {
>   "topicname" : "topic4",
>   "links" : [
> "http://www.rules.com;,
> "http://www.amazon.com;
>   ]
> }
>   ]
> }
>   ]
> }
>
> 5. Now when i go to Query tab and click Execute Search with *.*, it shows
> my recently added document as follows:
> {
> "responseHeader":{ "status":0, "QTime":0, "params":{ "q":"*:*", "_":
> "1563780352100"}}, "response":{"numFound":1,"start":0,"docs":[ { "id":"1",
> "
> prjname":["Project1"], "apps":["{appname=App1, topics=[{topicname=topic1,
> links=[http://www.google.com, http://www.t6.com]}, {topicname=topic2,
> links=[http://www.java.com, http://www.rediff.com]}]};, "{appname=App2,
> topics=[{topicname=topic3, links=[http://www.t3.com, http://www.t4.com]},
> {topicname=topic4, links=[http://www.rules.com, http://www.amazon.com
> ]}]}"],
> "_version_":1639742305772503040}] }}
>
> 6. But now when I am trying to search based on field topicname or prjname,
> it does not returns any document. Even if put anything in q like App1, zero
> results are being returned.
>
>
> Can someone help me understanding what I might have done incorrectly?
> May be I defined my schema incorrectly.
>
> Thanks in advance
>


Basic Query Not Working - Please Help

2019-07-22 Thread Vipul Bahuguna
Hi,

I have installed SOLR 8.1.1.
I am new and trying the very basics.

I installed solr8.1.1 on Windows and I am using SOLR in standalone mode.

Steps I followed -

1. created a core as follows:
solr create_core -c dox

2. updated the managed_schema.xml file to add few specific fields specific
to my schema as belows:






3. then i restarted SOLR

4. then i went to the Documents tab to enter my sample data for indexing,
which looks like below:
{
  "id" : "1",
  "prjname" : "Project1",
  "apps" : [
    {
      "appname" : "App1",
      "topics" : [
        {
          "topicname" : "topic1",
          "links" : [
            "http://www.google.com",
            "http://www.t6.com"
          ]
        },
        {
          "topicname" : "topic2",
          "links" : [
            "http://www.java.com",
            "http://www.rediff.com"
          ]
        }
      ]
    },
    {
      "appname" : "App2",
      "topics" : [
        {
          "topicname" : "topic3",
          "links" : [
            "http://www.t3.com",
            "http://www.t4.com"
          ]
        },
        {
          "topicname" : "topic4",
          "links" : [
            "http://www.rules.com",
            "http://www.amazon.com"
          ]
        }
      ]
    }
  ]
}

5. Now when i go to Query tab and click Execute Search with *.*, it shows
my recently added document as follows:
{
"responseHeader":{ "status":0, "QTime":0, "params":{ "q":"*:*", "_":
"1563780352100"}}, "response":{"numFound":1,"start":0,"docs":[ { "id":"1", "
prjname":["Project1"], "apps":["{appname=App1, topics=[{topicname=topic1,
links=[http://www.google.com, http://www.t6.com]}, {topicname=topic2,
links=[http://www.java.com, http://www.rediff.com]}]}", "{appname=App2,
topics=[{topicname=topic3, links=[http://www.t3.com, http://www.t4.com]},
{topicname=topic4, links=[http://www.rules.com, http://www.amazon.com]}]}"],
"_version_":1639742305772503040}] }}

6. But now when I am trying to search based on field topicname or prjname,
it does not return any documents. Even if I put anything in q like App1, zero
results are being returned.


Can someone help me understand what I might have done incorrectly?
Maybe I defined my schema incorrectly.

Thanks in advance


Re: SOLR EofException help

2019-06-13 Thread ennio
Thanks for the information. I will check my server timeout to see what is
happening. That was very helpful.

Also thanks for pointing out the swap space memory allocation I will double
check here.

 



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: SOLR EofException help

2019-06-13 Thread Shawn Heisey

On 6/13/2019 7:30 AM, ennio wrote:

The server for most part runs fine, but when I look at the logs I see from
time to time the following error.

org.eclipse.jetty.io.EofException: Closed


Jetty's EofException is nearly always caused by a specific event:

The client talking to Solr closed the TCP/HTTP connection before Solr 
was done processing the request.  When Solr finally finished the request 
and tried to respond, Jetty found that it could not send the response, 
because the TCP connection was gone.


You'll need to adjust the timeouts on your client software so that it 
allows Solr more time to respond and doesn't close the connection too 
quickly.
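
With SolrJ, for example, the timeouts can be raised when the client is built.
This is only a sketch; the URL and the millisecond values are placeholders to be
set larger than your slowest expected request:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class TimeoutExample {
    public static void main(String[] args) throws Exception {
        HttpSolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore")
                .withConnectionTimeout(30_000)   // ms allowed to establish the TCP connection
                .withSocketTimeout(300_000)      // ms to wait for a response before giving up
                .build();
        try {
            SolrQuery q = new SolrQuery("*:*");
            q.setRows(0);
            System.out.println("numFound: " + solr.query(q).getResults().getNumFound());
        } finally {
            solr.close();
        }
    }
}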


Side note:  Java says your server is using 5GB of swap.  If that's an 
accurate value, it's usually an indication that the software on the 
system is allocating a lot more memory than the server has.  It also 
says that the machine is only using 3GB out of the 8GB available, so the 
over-allocation must be non-persistent... and is probably 
periodic/scheduled.


With an index as small as you have, 2GB of heap is probably more than 
you need.  You could likely reduce that to 1GB, maybe even less. 
Knowing for sure will require experimentation.


Thanks,
Shawn


SOLR EofException help

2019-06-13 Thread ennio
181114]
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
[jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
[jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
[jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
[jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
[jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
[jetty-rewrite-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
[jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
at org.eclipse.jetty.server.Server.handle(Server.java:502)
[jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
[jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
[jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
[jetty-io-9.4.14.v20181114.jar:9.4.14.v20181114]
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
[jetty-io-9.4.14.v20181114.jar:9.4.14.v20181114]
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
[jetty-io-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
[jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
[jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
[jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
[jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
[jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
[jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
at
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
[jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
at java.lang.Thread.run(Unknown Source) [?:1.8.0_211]

Do I need to allocate more resources for the server? Or is there anything
else to try to debug this error? Usually after the error SOLR will be slow,
and the next few queries would take a while to process but eventually it
comes back to a normal state. 

I'm not running SOLR cloud, it's just a single instance of SOLR.

Thanks for the help. 



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread Erick Erickson
>>> testing your index will show you what a good maximum
>>> amount of segments is for your index.
>>> 
>>>> On 7 Jun 2019, at 07:27, jena 
> 
>> sthita2010@
> 
>>  wrote:
>>>> 
>>>> Hello guys,
>>>> 
>>>> We have 4 solr(version 4.4) instance on production environment, which
>>>> are
>>>> linked/associated with zookeeper for replication. We do heavy deleted &
>>>> add
>>>> operations. We have around 26million records and the index size is
>>>> around
>>>> 70GB. We serve 100k+ requests per day.
>>>> 
>>>> 
>>>> Because of heavy indexing & deletion, we optimise solr instance
>>>> everyday,
>>>> because of that our solr cloud getting unstable , every solr instance go
>>>> on
>>>> recovery mode & our search is getting affected & very slow because of
>>>> that.
>>>> Optimisation takes around 1hr 30minutes.
>>>> We are not able fix this issue, please help.
>>>> 
>>>> Thanks & Regards
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> 
> 
> 
> 
> 
>--
>Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> 



Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread David Santamauro
I use the same algorithm and for me, initialMaxSegments is always the number of
segments currently in the index (seen, e.g., in the SOLR admin UI).
finalMaxSegments depends on what kind of updates have happened. If I know that
"older" documents are untouched, then I'll usually use -60% or even -70%,
depending on the initialMaxSegments. I have a few cores where I'll even go all
the way down to 1.

If you are going to attempt this, I'd suggest testing with a small reduction,
say 10 segments, and monitoring the index size and the difference between maxDoc and
numDocs. I've shaved ~ 1T off of an index optimizing from 75 down to 30
segments (7T index total) and removed a significant % of deleted documents in
the process. YMMV ...

If you are using a version of SOLR >=7.5 (see LUCENE-7976), this might all be 
moot.

//


On 6/7/19, 2:29 PM, "jena"  wrote:

Thanks @Michael Joyner,  how did you decide initialmax segment to 256 ? Or 
it
is some random number i can use for my case ? Can you guuide me how to
decide the initial & final max segments ?

 
Michael Joyner wrote
> That is the way we do it here - also helps a lot with not needing x2 or 
> x3 disk space to handle the merge:
> 
> public void solrOptimize() {
>  int initialMaxSegments = 256;
>  int finalMaxSegments = 4;
>  if (isShowSegmentCounter()) {
>  log.info("Optimizing ...");
>  }
>  try (SolrClient solrServerInstance = getSolrClientInstance()) {
>  for (int segments = initialMaxSegments; segments >= 
> finalMaxSegments; segments--) {
>  if (isShowSegmentCounter()) {
>  System.out.println("Optimizing to a max of " + 
> segments + " segments.");
>  }
>  try {
>  solrServerInstance.optimize(true, true, segments);
>  } catch (RemoteSolrException | SolrServerException | 
> IOException e) {
>  log.severe(e.getMessage());
>  }
>  }
>  } catch (IOException e) {
>  throw new RuntimeException(e);
>  }
>  }
> 
> On 6/7/19 4:56 AM, Nicolas Franck wrote:
>> In that case, hard optimisation like that is out the question.
>> Resort to automatic merge policies, specifying a maximum
>> amount of segments. Solr is created with multiple segments
>> in mind. Hard optimisation seems like not worth the problem.
>>
>> The problem is this: the less segments you specify during
>> during an optimisation, the longer it will take, because it has to read
>> all of these segments to be merged, and redo the sorting. And a cluster
>> has a lot of housekeeping on top of it.
>>
>> If you really want to issue a optimisation, then you can
>> also do it in steps (max segments parameter)
>>
>> 10 -> 9 -> 8 -> 7 .. -> 1
>>
>> that way less segments need to be merged in one go.
>>
>> testing your index will show you what a good maximum
>> amount of segments is for your index.
>>
>>> On 7 Jun 2019, at 07:27, jena 

> sthita2010@

>  wrote:
>>>
>>> Hello guys,
>>>
>>> We have 4 solr(version 4.4) instance on production environment, which
>>> are
>>> linked/associated with zookeeper for replication. We do heavy deleted &
>>> add
>>> operations. We have around 26million records and the index size is
>>> around
>>> 70GB. We serve 100k+ requests per day.
    >>>
>>>
>>> Because of heavy indexing & deletion, we optimise solr instance
>>> everyday,
>>> because of that our solr cloud getting unstable , every solr instance go
>>> on
>>> recovery mode & our search is getting affected & very slow because of
>>> that.
>>> Optimisation takes around 1hr 30minutes.
>>> We are not able fix this issue, please help.
>>>
>>> Thanks & Regards
>>>
>>>
>>>
>>> --
>>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html





--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html



Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread jena
Thanks @Michael Joyner, how did you decide on 256 for the initial max segments? Or is it
some random number I can use for my case? Can you guide me on how to
decide the initial & final max segments?

 
Michael Joyner wrote
> That is the way we do it here - also helps a lot with not needing x2 or 
> x3 disk space to handle the merge:
> 
> public void solrOptimize() {
>          int initialMaxSegments = 256;
>          int finalMaxSegments = 4;
>          if (isShowSegmentCounter()) {
>              log.info("Optimizing ...");
>          }
>          try (SolrClient solrServerInstance = getSolrClientInstance()) {
>              for (int segments = initialMaxSegments; segments >= 
> finalMaxSegments; segments--) {
>                  if (isShowSegmentCounter()) {
>                      System.out.println("Optimizing to a max of " + 
> segments + " segments.");
>                  }
>                  try {
>                      solrServerInstance.optimize(true, true, segments);
>                  } catch (RemoteSolrException | SolrServerException | 
> IOException e) {
>                      log.severe(e.getMessage());
>                  }
>              }
>          } catch (IOException e) {
>              throw new RuntimeException(e);
>          }
>      }
> 
> On 6/7/19 4:56 AM, Nicolas Franck wrote:
>> In that case, hard optimisation like that is out the question.
>> Resort to automatic merge policies, specifying a maximum
>> amount of segments. Solr is created with multiple segments
>> in mind. Hard optimisation seems like not worth the problem.
>>
>> The problem is this: the less segments you specify during
>> during an optimisation, the longer it will take, because it has to read
>> all of these segments to be merged, and redo the sorting. And a cluster
>> has a lot of housekeeping on top of it.
>>
>> If you really want to issue a optimisation, then you can
>> also do it in steps (max segments parameter)
>>
>> 10 -> 9 -> 8 -> 7 .. -> 1
>>
>> that way less segments need to be merged in one go.
>>
>> testing your index will show you what a good maximum
>> amount of segments is for your index.
>>
>>> On 7 Jun 2019, at 07:27, jena 

> sthita2010@

>  wrote:
>>>
>>> Hello guys,
>>>
>>> We have 4 solr(version 4.4) instance on production environment, which
>>> are
>>> linked/associated with zookeeper for replication. We do heavy deleted &
>>> add
>>> operations. We have around 26million records and the index size is
>>> around
>>> 70GB. We serve 100k+ requests per day.
>>>
>>>
>>> Because of heavy indexing & deletion, we optimise solr instance
>>> everyday,
>>> because of that our solr cloud getting unstable , every solr instance go
>>> on
>>> recovery mode & our search is getting affected & very slow because of
>>> that.
>>> Optimisation takes around 1hr 30minutes.
>>> We are not able fix this issue, please help.
>>>
>>> Thanks & Regards
>>>
>>>
>>>
>>> --
>>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html





--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread Erick Erickson



> On Jun 7, 2019, at 7:53 AM, David Santamauro  
> wrote:
> 
> So is this new optimize maxSegments / commit expungeDeletes behavior in 7.5? 
> My experience, and I watch the my optimize process very closely, is that 
> using maxSgements does not touch every segment with a deleted document. 
> expungeDeletes merges all segments that have deleted documents that have been 
> touched with said commit.
> 

Which part? 

The  different thing about 7.5 is that an optimize that doesn’t specify 
maxSegments will remove all deleted docs from an index without creating massive 
segments. Prior to 7.5 a simple optimize would create a single segment by 
default, no matter how large.

If, after the end of an optimize on a quiescent index, you see a difference 
between maxDoc and numDocs (or  deletedDocs  > 0) for a core, then that’s 
entirely unexpected  for any version of Solr.  NOTE: If you are actively 
indexing while optimizing you may see deleted docs in your index after optimize 
since optimize works on the segments it sees when the operation starts….

ExpungeDeletes has always, IIUC, defaulted to only merging segments  with > 10% 
deleted docs.

Best,
Erick

> After reading LUCENE-7976, it seems this is, indeed, new behavior.
> 
> 
> On 6/7/19, 10:31 AM, "Erick Erickson"  wrote:
> 
>Optimizing guarantees that there will be _no_ deleted documents in an 
> index when done. If a segment has even one deleted document, it’s merged, no 
> matter what you specify for maxSegments. 
> 
>Segments are write-once, so to remove deleted data from a segment it must 
> be at least rewritten into a new segment, whether or not it’s merged with 
> another segment on optimize.
> 
>expungeDeletes  does _not_ merge every segment that has deleted documents. 
> It merges segments that have > 10% (the default) deleted documents. If your 
> index happens to have all segments with > 10% deleted docs, then it will, 
> indeed, merge all of them.
> 
>In your example, if you look closely you should find that all segments 
> that had any deleted documents were written (merged) to new segments. I’d 
> expect that segments with _no_ deleted documents might mostly be left alone. 
> And two of the segments were chosen to merge together.
> 
>See LUCENE-7976 for a long discussion of how this changed starting  with 
> SOLR 7.5.
> 
>Best,
>Erick
> 
>> On Jun 7, 2019, at 7:07 AM, David Santamauro  
>> wrote:
>> 
>> Erick, on 6.0.1, optimize with maxSegments only merges down to the specified 
>> number. E.g., given an index with 75 segments, optimize with maxSegments=74 
>> will only merge 2 segments leaving 74 segments. It will choose a segment to 
>> merge that has deleted documents, but does not merge every segment with 
>> deleted documents.
>> 
>> I think you are thinking about the expungeDeletes parameter on the commit 
>> request. That will merge every segment that has a deleted document.
>> 
>> 
>> On 6/7/19, 10:00 AM, "Erick Erickson"  wrote:
>> 
>>   This isn’t quite right. Solr will rewrite _all_ segments that have _any_ 
>> deleted documents in them when optimizing, even one. Given your description, 
>> I’d guess that all your segments will have deleted documents, so even if you 
>> do specify maxSegments on the optimize command, the entire index will be 
>> rewritten.
>> 
>>   You’re in a bind, see: 
>> https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/.
>>  You have this one massive segment and it will _not_ be merged until it’s 
>> almost all deleted documents, see the link above for a fuller explanation.
>> 
>>   Prior to Solr 7.5 you don’t have many options except to re-index and _not_ 
>> optimize. So if possible I’d reindex from scratch into a new collection and 
>> do not optimize. Or restructure your process such that you can optimize in a 
>> quiet period when little indexing is going on.
>> 
>>   Best,
>>   Erick
>> 
>>> On Jun 7, 2019, at 2:51 AM, jena  wrote:
>>> 
>>> Thanks @Nicolas Franck for reply, i don't see any any segment info for 4.4
>>> version. Is there any API i can use to get my segment information ? Will try
>>> to use maxSegments and see if it can help us during optimization.
>>> 
>>> 
>>> 
>>> --
>>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>> 
>> 
> 
> 



Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread David Santamauro
So is this new optimize maxSegments / commit expungeDeletes behavior in 7.5? My
experience, and I watch my optimize process very closely, is that using
maxSegments does not touch every segment with a deleted document.
expungeDeletes merges all segments that have deleted documents that have been 
touched with said commit.

After reading LUCENE-7976, it seems this is, indeed, new behavior.


On 6/7/19, 10:31 AM, "Erick Erickson"  wrote:

Optimizing guarantees that there will be _no_ deleted documents in an index 
when done. If a segment has even one deleted document, it’s merged, no matter 
what you specify for maxSegments. 

Segments are write-once, so to remove deleted data from a segment it must 
be at least rewritten into a new segment, whether or not it’s merged with 
another segment on optimize.

expungeDeletes  does _not_ merge every segment that has deleted documents. 
It merges segments that have > 10% (the default) deleted documents. If your 
index happens to have all segments with > 10% deleted docs, then it will, 
indeed, merge all of them.

In your example, if you look closely you should find that all segments that 
had any deleted documents were written (merged) to new segments. I’d expect 
that segments with _no_ deleted documents might mostly be left alone. And two 
of the segments were chosen to merge together.

See LUCENE-7976 for a long discussion of how this changed starting  with 
SOLR 7.5.

Best,
Erick

> On Jun 7, 2019, at 7:07 AM, David Santamauro  
wrote:
> 
> Erick, on 6.0.1, optimize with maxSegments only merges down to the 
specified number. E.g., given an index with 75 segments, optimize with 
maxSegments=74 will only merge 2 segments leaving 74 segments. It will choose a 
segment to merge that has deleted documents, but does not merge every segment 
with deleted documents.
> 
> I think you are thinking about the expungeDeletes parameter on the commit 
request. That will merge every segment that has a deleted document.
> 
> 
> On 6/7/19, 10:00 AM, "Erick Erickson"  wrote:
> 
>This isn’t quite right. Solr will rewrite _all_ segments that have 
_any_ deleted documents in them when optimizing, even one. Given your 
description, I’d guess that all your segments will have deleted documents, so 
even if you do specify maxSegments on the optimize command, the entire index 
will be rewritten.
> 
>You’re in a bind, see: 
https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/.
 You have this one massive segment and it will _not_ be merged until it’s 
almost all deleted documents, see the link above for a fuller explanation.
> 
>Prior to Solr 7.5 you don’t have many options except to re-index and 
_not_ optimize. So if possible I’d reindex from scratch into a new collection 
and do not optimize. Or restructure your process such that you can optimize in 
a quiet period when little indexing is going on.
> 
>Best,
>Erick
> 
>> On Jun 7, 2019, at 2:51 AM, jena  wrote:
>> 
>> Thanks @Nicolas Franck for reply, i don't see any any segment info for 
4.4
>> version. Is there any API i can use to get my segment information ? Will 
try
>> to use maxSegments and see if it can help us during optimization.
>> 
>> 
>> 
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> 
> 




Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread Erick Erickson
Optimizing guarantees that there will be _no_ deleted documents in an index 
when done. If a segment has even one deleted document, it’s merged, no matter 
what you specify for maxSegments. 

Segments are write-once, so to remove deleted data from a segment it must be at 
least rewritten into a new segment, whether or not it’s merged with another 
segment on optimize.

expungeDeletes  does _not_ merge every segment that has deleted documents. It 
merges segments that have > 10% (the default) deleted documents. If your index 
happens to have all segments with > 10% deleted docs, then it will, indeed, 
merge all of them.

In your example, if you look closely you should find that all segments that had 
any deleted documents were written (merged) to new segments. I’d expect that 
segments with _no_ deleted documents might mostly be left alone. And two of the 
segments were chosen to merge together.

See LUCENE-7976 for a long discussion of how this changed starting  with SOLR 
7.5.

Best,
Erick

> On Jun 7, 2019, at 7:07 AM, David Santamauro  
> wrote:
> 
> Erick, on 6.0.1, optimize with maxSegments only merges down to the specified 
> number. E.g., given an index with 75 segments, optimize with maxSegments=74 
> will only merge 2 segments leaving 74 segments. It will choose a segment to 
> merge that has deleted documents, but does not merge every segment with 
> deleted documents.
> 
> I think you are thinking about the expungeDeletes parameter on the commit 
> request. That will merge every segment that has a deleted document.
> 
> 
> On 6/7/19, 10:00 AM, "Erick Erickson"  wrote:
> 
>This isn’t quite right. Solr will rewrite _all_ segments that have _any_ 
> deleted documents in them when optimizing, even one. Given your description, 
> I’d guess that all your segments will have deleted documents, so even if you 
> do specify maxSegments on the optimize command, the entire index will be 
> rewritten.
> 
>You’re in a bind, see: 
> https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/.
>  You have this one massive segment and it will _not_ be merged until it’s 
> almost all deleted documents, see the link above for a fuller explanation.
> 
>Prior to Solr 7.5 you don’t have many options except to re-index and _not_ 
> optimize. So if possible I’d reindex from scratch into a new collection and 
> do not optimize. Or restructure your process such that you can optimize in a 
> quiet period when little indexing is going on.
> 
>Best,
>Erick
> 
>> On Jun 7, 2019, at 2:51 AM, jena  wrote:
>> 
>> Thanks @Nicolas Franck for reply, i don't see any any segment info for 4.4
>> version. Is there any API i can use to get my segment information ? Will try
>> to use maxSegments and see if it can help us during optimization.
>> 
>> 
>> 
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> 
> 



Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread jena
Thanks @Erick for the suggestions. That looks so bad; yes, your assumptions
are right, we have a lot of delete & index operations as well.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread jena
Thanks Shawn for the suggestions. Interesting to know deleteByQuery has some
impact; will try to change it as you have suggested. Thanks



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread David Santamauro

/clarification/ ... expungeDeletes will merge every segment *touched by the 
current commit* that has a deleted document.


On 6/7/19, 10:07 AM, "David Santamauro"  wrote:

Erick, on 6.0.1, optimize with maxSegments only merges down to the 
specified number. E.g., given an index with 75 segments, optimize with 
maxSegments=74 will only merge 2 segments leaving 74 segments. It will choose a 
segment to merge that has deleted documents, but does not merge every segment 
with deleted documents.

I think you are thinking about the expungeDeletes parameter on the commit 
request. That will merge every segment that has a deleted document.


On 6/7/19, 10:00 AM, "Erick Erickson"  wrote:

This isn’t quite right. Solr will rewrite _all_ segments that have 
_any_ deleted documents in them when optimizing, even one. Given your 
description, I’d guess that all your segments will have deleted documents, so 
even if you do specify maxSegments on the optimize command, the entire index 
will be rewritten.

You’re in a bind, see: 
https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/.
 You have this one massive segment and it will _not_ be merged until it’s 
almost all deleted documents, see the link above for a fuller explanation.

Prior to Solr 7.5 you don’t have many options except to re-index and 
_not_ optimize. So if possible I’d reindex from scratch into a new collection 
and do not optimize. Or restructure your process such that you can optimize in 
a quiet period when little indexing is going on.

Best,
Erick

> On Jun 7, 2019, at 2:51 AM, jena  wrote:
> 
> Thanks @Nicolas Franck for reply, i don't see any any segment info 
for 4.4
> version. Is there any API i can use to get my segment information ? 
Will try
> to use maxSegments and see if it can help us during optimization.
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html





Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread David Santamauro
Erick, on 6.0.1, optimize with maxSegments only merges down to the specified 
number. E.g., given an index with 75 segments, optimize with maxSegments=74 
will only merge 2 segments leaving 74 segments. It will choose a segment to 
merge that has deleted documents, but does not merge every segment with deleted 
documents.

I think you are thinking about the expungeDeletes parameter on the commit 
request. That will merge every segment that has a deleted document.


On 6/7/19, 10:00 AM, "Erick Erickson"  wrote:

This isn’t quite right. Solr will rewrite _all_ segments that have _any_ 
deleted documents in them when optimizing, even one. Given your description, 
I’d guess that all your segments will have deleted documents, so even if you do 
specify maxSegments on the optimize command, the entire index will be rewritten.

You’re in a bind, see: 
https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/.
 You have this one massive segment and it will _not_ be merged until it’s 
almost all deleted documents, see the link above for a fuller explanation.

Prior to Solr 7.5 you don’t have many options except to re-index and _not_ 
optimize. So if possible I’d reindex from scratch into a new collection and do 
not optimize. Or restructure your process such that you can optimize in a quiet 
period when little indexing is going on.

Best,
Erick

> On Jun 7, 2019, at 2:51 AM, jena  wrote:
> 
> Thanks @Nicolas Franck for reply, i don't see any any segment info for 4.4
> version. Is there any API i can use to get my segment information ? Will 
try
> to use maxSegments and see if it can help us during optimization.
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html




Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread Michael Joyner
That is the way we do it here - also helps a lot with not needing x2 or 
x3 disk space to handle the merge:


public void solrOptimize() {
    int initialMaxSegments = 256;
    int finalMaxSegments = 4;
    if (isShowSegmentCounter()) {
        log.info("Optimizing ...");
    }
    try (SolrClient solrServerInstance = getSolrClientInstance()) {
        for (int segments = initialMaxSegments; segments >= finalMaxSegments; segments--) {
            if (isShowSegmentCounter()) {
                System.out.println("Optimizing to a max of " + segments + " segments.");
            }
            try {
                solrServerInstance.optimize(true, true, segments);
            } catch (RemoteSolrException | SolrServerException | IOException e) {
                log.severe(e.getMessage());
            }
        }
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}

On 6/7/19 4:56 AM, Nicolas Franck wrote:

In that case, hard optimisation like that is out the question.
Resort to automatic merge policies, specifying a maximum
amount of segments. Solr is created with multiple segments
in mind. Hard optimisation seems like not worth the problem.

The problem is this: the less segments you specify during
during an optimisation, the longer it will take, because it has to read
all of these segments to be merged, and redo the sorting. And a cluster
has a lot of housekeeping on top of it.

If you really want to issue a optimisation, then you can
also do it in steps (max segments parameter)

10 -> 9 -> 8 -> 7 .. -> 1

that way less segments need to be merged in one go.

testing your index will show you what a good maximum
amount of segments is for your index.


On 7 Jun 2019, at 07:27, jena  wrote:

Hello guys,

We have 4 solr(version 4.4) instance on production environment, which are
linked/associated with zookeeper for replication. We do heavy deleted & add
operations. We have around 26million records and the index size is around
70GB. We serve 100k+ requests per day.


Because of heavy indexing & deletion, we optimise solr instance everyday,
because of that our solr cloud getting unstable , every solr instance go on
recovery mode & our search is getting affected & very slow because of that.
Optimisation takes around 1hr 30minutes.
We are not able fix this issue, please help.

Thanks & Regards



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html




Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread Erick Erickson
This isn’t quite right. Solr will rewrite _all_ segments that have _any_ 
deleted documents in them when optimizing, even one. Given your description, 
I’d guess that all your segments will have deleted documents, so even if you do 
specify maxSegments on the optimize command, the entire index will be rewritten.

You’re in a bind, see: 
https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/.
 You have this one massive segment and it will _not_ be merged until it’s 
almost all deleted documents, see the link above for a fuller explanation.

Prior to Solr 7.5 you don’t have many options except to re-index and _not_ 
optimize. So if possible I’d reindex from scratch into a new collection and do 
not optimize. Or restructure your process such that you can optimize in a quiet 
period when little indexing is going on.

Best,
Erick

> On Jun 7, 2019, at 2:51 AM, jena  wrote:
> 
> Thanks @Nicolas Franck for reply, i don't see any any segment info for 4.4
> version. Is there any API i can use to get my segment information ? Will try
> to use maxSegments and see if it can help us during optimization.
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html



Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread Shawn Heisey

On 6/6/2019 11:27 PM, jena wrote:

Because of heavy indexing & deletion, we optimise solr instance everyday,
because of that our solr cloud getting unstable , every solr instance go on
recovery mode & our search is getting affected & very slow because of that.
Optimisation takes around 1hr 30minutes.


Ordinarily, optimizing would just be a transparent operation and even 
though it's slow, wouldn't be something that would interfere with index 
operation.


But if you add deleteByQuery to the mix, then you WILL have problems. 
These problems can occur even if you don't optimize -- because sometimes 
the normal segment merges will take a very long time like an optimize, 
and the same interference between deleteByQuery and segment merging will 
happen.


The fix for that is to stop doing deleteByQuery.  Replace it with a two 
step operation where you first do the query to get ID values, and then 
do deleteById.  That kind of delete will not have any bad interaction 
with segment merging.
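
A minimal SolrJ sketch of that two-step replacement; the collection name
"mycollection", the uniqueKey field "id", and the delete criterion
"status:expired" are assumptions for illustration:

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrDocument;

public class DeleteByIdExample {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection").build()) {
            // Step 1: query for the IDs that match the delete criterion.
            SolrQuery q = new SolrQuery("status:expired");
            q.setFields("id");
            q.setRows(1000);   // page through (e.g. with cursorMark) for larger result sets
            List<String> ids = new ArrayList<>();
            for (SolrDocument doc : solr.query(q).getResults()) {
                ids.add((String) doc.getFieldValue("id"));
            }
            // Step 2: delete by ID, which does not interact badly with segment merging.
            if (!ids.isEmpty()) {
                solr.deleteById(ids);
                solr.commit();
            }
        }
    }
}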


Thanks,
Shawn


Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread jena
Thanks @Nicolas Franck for reply, i don't see any any segment info for 4.4
version. Is there any API i can use to get my segment information ? Will try
to use maxSegments and see if it can help us during optimization.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread Nicolas Franck
In that case, hard optimisation like that is out of the question.
Resort to automatic merge policies, specifying a maximum
number of segments. Solr is created with multiple segments
in mind. Hard optimisation seems not worth the trouble.

The problem is this: the fewer segments you specify
during an optimisation, the longer it will take, because it has to read
all of these segments to be merged, and redo the sorting. And a cluster
has a lot of housekeeping on top of it.

If you really want to issue an optimisation, then you can
also do it in steps (max segments parameter)

10 -> 9 -> 8 -> 7 .. -> 1

that way fewer segments need to be merged in one go.

Testing your index will show you what a good maximum
number of segments is for your index.

> On 7 Jun 2019, at 07:27, jena  wrote:
> 
> Hello guys,
> 
> We have 4 solr(version 4.4) instance on production environment, which are
> linked/associated with zookeeper for replication. We do heavy deleted & add
> operations. We have around 26million records and the index size is around
> 70GB. We serve 100k+ requests per day.
> 
> 
> Because of heavy indexing & deletion, we optimise solr instance everyday,
> because of that our solr cloud getting unstable , every solr instance go on
> recovery mode & our search is getting affected & very slow because of that.
> Optimisation takes around 1hr 30minutes. 
> We are not able fix this issue, please help.
> 
> Thanks & Regards
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html



Urgent help on solr optimisation issue !!

2019-06-07 Thread jena
Hello guys,

We have 4 solr (version 4.4) instances on a production environment, which are
linked/associated with zookeeper for replication. We do heavy delete & add
operations. We have around 26 million records and the index size is around
70GB. We serve 100k+ requests per day.


Because of heavy indexing & deletion, we optimise the solr instances every day;
because of that our solr cloud is getting unstable, every solr instance goes into
recovery mode & our search is getting affected & very slow because of that.
Optimisation takes around 1hr 30 minutes.
We are not able to fix this issue, please help.

Thanks & Regards



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Please help on pdate type during indexing

2019-06-03 Thread Shawn Heisey

On 6/2/2019 11:34 PM, derrick cui wrote:

I spent whole day to indexing my data to solr(8.0), but there is one field 
which type is pdate always failed.
error adding field 
'UpdateDate'='org.apache.solr.common.SolrInputField:UpdateDate=2019-06-03T05:22:14.842Z'
 msg=Invalid Date in Date Math 
String:'org.apache.solr.common.SolrInputField:UpdateDate=2019-06-03T05:22:14.842Z',,
 retry=0 commError=false errorCode=400


If that whole string (including the org.apache.solr stuff) is the input, 
it will fail.  The value will need to be the ISO date format, starting 
with the year, including the T, and ending with Z.



I have put timezone in the string, please help,


As far as I know, you can't do that.  Solr only handles dates in UTC - 
no timezone.


There is one place where Solr has a timezone configurable -- that's for 
date math.  So that Solr knows what time a day starts when it is doing 
NOW/DAY, NOW/WEEK, etc.  Other than that, everything is ALWAYS in UTC.
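
A small SolrJ sketch of sending an acceptable value; the core URL and the
document id are made up, and the field name "UpdateDate" just mirrors the error
above:

import java.time.Instant;
import java.time.format.DateTimeFormatter;

import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class DateFieldExample {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1");
            // ISO_INSTANT renders UTC with the 'T' and trailing 'Z', e.g. 2019-06-03T05:22:14.842Z.
            // Passing a SolrInputField object itself (whose toString() begins with
            // "org.apache.solr.common.SolrInputField:") is what triggers "Invalid Date in Date Math".
            doc.addField("UpdateDate", DateTimeFormatter.ISO_INSTANT.format(Instant.now()));
            solr.add(doc);
            solr.commit();
        }
    }
}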


Thanks,
Shawn


Please help on pdate type during indexing

2019-06-02 Thread derrick cui
Hi all,
I spent the whole day indexing my data into solr (8.0), but there is one field
whose type is pdate that always failed.
error adding field 
'UpdateDate'='org.apache.solr.common.SolrInputField:UpdateDate=2019-06-03T05:22:14.842Z'
 msg=Invalid Date in Date Math 
String:'org.apache.solr.common.SolrInputField:UpdateDate=2019-06-03T05:22:14.842Z',,
 retry=0 commError=false errorCode=400 

I have put timezone in the string, please help,
thanks

Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Shawn Heisey

On 5/3/2019 12:52 AM, Salmaan Rashid Syed wrote:

I say that the nodes are limited to 4 because when I launch Solr in cloud
mode, the first prompt that I get is to choose number of nodes [1-4]. When
I tried to enter 7, it says that they are more than 4 and choose a smaller
number.


That's the cloud *EXAMPLE*.  It sets everything up on one server that 
would normally be on separate servers, and runs an embedded zookeeper in 
the first node.


Example setups are not meant for production.

Thanks,
Shawn


Help extracting text from PDF images when indexing files

2019-05-03 Thread Miguel Fernandes
Hi all,

I'm new to Solr; I've recently downloaded solr 8.0.0 and have been
following the tutorials. Using the 2 example instances created, I'm trying
to create my own collection. I've made a copy of the _default configset and
used it to create my collection.

In my case, the files I want to index are pdf files composed of images. I
have tesseract installed and I can parse the pdf files correctly using a
tika server instance I downloaded, i.e. I can get the extracted text from
the images.

I'm following the instructions from the page "Uploading Data with Solr Cell
Using Apache Tika" to properly configure the PDF image extraction, but I'm
not able to get this working correctly. My aim is that the content of the PDF
file goes into a field named content that I've created in my schema. From
my attempts, this field is non-existent or, when it exists, it doesn't contain
the expected text from the parsed images.

In the configuration of ExtractingRequestHandler, the lib clauses are
present in my solrconfig.xml, that section is as below:

  

  true
  content

parseContext.xml
  

And my parseContext.xml file is:





    


Any help on how to correctly extract the text from the PDF images would be
great.
Thanks
Miguel


Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Jörn Franke
This is just the setup for an experimental cluster (generally it does also not 
make sense to have many instances on the same server). Once you have got more 
experience take a look at 
https://lucene.apache.org/solr/guide/7_7/taking-solr-to-production.html

To see how to set up clusters.

> Am 03.05.2019 um 08:52 schrieb Salmaan Rashid Syed 
> :
> 
> Thanks Jorn for your reply.
> 
> I say that the nodes are limited to 4 because when I launch Solr in cloud
> mode, the first prompt that I get is to choose number of nodes [1-4]. When
> I tried to enter 7, it says that they are more than 4 and choose a smaller
> number.
> 
> 
> *Thanks and Regards,*
> Salmaan Rashid Syed
> +91 8978353445 | www.panna.ai |
> 5550 Granite Pkwy, Suite #225, Plano TX-75024.
> Cyber Gateways, Hi-tech City, Hyderabad, Telangana, India.
> 
> 
> 
>> On Fri, May 3, 2019 at 12:05 PM Jörn Franke  wrote:
>> 
>> BTW why do you think that SolrCloud is limited to 4 nodes? More are for
>> sure possible.
>> 
>>> Am 03.05.2019 um 07:54 schrieb Salmaan Rashid Syed <
>> salmaan.ras...@mroads.com>:
>>> 
>>> Hi Solr Users,
>>> 
>>> I am using Solr 7.6 in cloud mode with external zookeeper installed at
>>> ports 2181, 2182, 2183. Currently we have only one server allocated for
>>> Solr. We are planning to move to multiple servers for better sharing,
>>> replication etc in near future.
>>> 
>>> Now the issue is that, our organisation has data indexed for different
>>> clients as separate collections. We want to uniquely access, update and
>>> index each collection separately so that each individual client has
>> access
>>> to their respective collections at their respective ports. Eg:—
>> Collection1
>>> at port 8983, Collection2 at port 8984, Collection3 at port 8985 etc.
>>> 
>>> I have two options I guess, one is to run Solr in cloud mode with 4 nodes
>>> (max as limited by Solr) at 4 different ports. I don’t know how to go
>>> beyond 4 nodes/ports in this case.
>>> 
>>> The other option is to run Solr as service and create multiple copies of
>>> Solr folder within the Server folder and access each Solr at different
>> port
>>> with its own collection as shown by
>>> https://www.youtube.com/watch?v=wmQFwK2sujE
>>> 
>>> I am really confused as to which is the better path to choose. Please
>> help
>>> me out.
>>> 
>>> Thanks.
>>> 
>>> Regards,
>>> Salmaan
>>> 
>>> 
>>> *Thanks and Regards,*
>>> Salmaan Rashid Syed
>>> +91 8978353445 | www.panna.ai |
>>> 5550 Granite Pkwy, Suite #225, Plano TX-75024.
>>> Cyber Gateways, Hi-tech City, Hyderabad, Telangana, India.
>> 


Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Salmaan Rashid Syed
Thanks Jorn for your reply.

I say that the nodes are limited to 4 because when I launch Solr in cloud
mode, the first prompt that I get is to choose number of nodes [1-4]. When
I tried to enter 7, it says that they are more than 4 and choose a smaller
number.


*Thanks and Regards,*
Salmaan Rashid Syed
+91 8978353445 | www.panna.ai |
5550 Granite Pkwy, Suite #225, Plano TX-75024.
Cyber Gateways, Hi-tech City, Hyderabad, Telangana, India.



On Fri, May 3, 2019 at 12:05 PM Jörn Franke  wrote:

> BTW why do you think that SolrCloud is limited to 4 nodes? More are for
> sure possible.
>
> > Am 03.05.2019 um 07:54 schrieb Salmaan Rashid Syed <
> salmaan.ras...@mroads.com>:
> >
> > Hi Solr Users,
> >
> > I am using Solr 7.6 in cloud mode with external zookeeper installed at
> > ports 2181, 2182, 2183. Currently we have only one server allocated for
> > Solr. We are planning to move to multiple servers for better sharing,
> > replication etc in near future.
> >
> > Now the issue is that, our organisation has data indexed for different
> > clients as separate collections. We want to uniquely access, update and
> > index each collection separately so that each individual client has
> access
> > to their respective collections at their respective ports. Eg:—
> Collection1
> > at port 8983, Collection2 at port 8984, Collection3 at port 8985 etc.
> >
> > I have two options I guess, one is to run Solr in cloud mode with 4 nodes
> > (max as limited by Solr) at 4 different ports. I don’t know how to go
> > beyond 4 nodes/ports in this case.
> >
> > The other option is to run Solr as service and create multiple copies of
> > Solr folder within the Server folder and access each Solr at different
> port
> > with its own collection as shown by
> > https://www.youtube.com/watch?v=wmQFwK2sujE
> >
> > I am really confused as to which is the better path to choose. Please
> help
> > me out.
> >
> > Thanks.
> >
> > Regards,
> > Salmaan
> >
> >
> > *Thanks and Regards,*
> > Salmaan Rashid Syed
> > +91 8978353445 | www.panna.ai |
> > 5550 Granite Pkwy, Suite #225, Plano TX-75024.
> > Cyber Gateways, Hi-tech City, Hyderabad, Telangana, India.
>


Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Salmaan Rashid Syed
Thanks Walter,

Since I am new to Solr, and looking at your suggestion, it seems I am trying
to do something very complicated and outside the out-of-the-box capabilities
of Solr. I really don't want to do that.

I am not from Computer Science background and my specialisation is in
Analytics and AI.

Let me put my case scenario briefly.

We have developed a customised Solr-search engine that can search for data
(prepared, cleaned and preprocessed by us) in each individual Solr
collection.

Every client of ours is from a different vertical (like health,
engineering, public services, finance, casual works etc). They search for
data in their respective Solr collection. They also add, update and
re-index their respective data periodically.

As you suggested, if I serve all the collections from a single port, will the
latency not increase, will the burden on a single server not increase, and
will the computational speed not slow down, as all the clients are trying to
talk to the same port simultaneously?

Or do you think that Solr-as-a-service is the better option, where I can
create multiple Solr instances at different ports, with each client's
collections in its own Solr instance?

To be honest, I really don't know what the Solr-as-a-service approach is
trying to accomplish.

Apologies for lengthy question and Thanks in advance.


*Thanks and Regards,*
Salmaan Rashid Syed
+91 8978353445 | www.panna.ai |
5550 Granite Pkwy, Suite #225, Plano TX-75024.
Cyber Gateways, Hi-tech City, Hyderabad, Telangana, India.



On Fri, May 3, 2019 at 11:59 AM Walter Underwood 
wrote:

> The best option is to run all the collections at the same port.
> Intra-cluster communication cannot be split over multiple ports, so this
> would require big internal changes to Solr. And what about communication
> that does not belong to a collection, like electing an overseer node?
>
> Why do you want the very non-standard configuration?
>
> If you must have it, run a webserver like nginx on each node, configure it
> to do this crazy multiple port thing for external traffic and to forward
> all traffic to Solr’s single port.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On May 3, 2019, at 7:54 AM, Salmaan Rashid Syed <
> salmaan.ras...@mroads.com> wrote:
> >
> > Hi Solr Users,
> >
> > I am using Solr 7.6 in cloud mode with external zookeeper installed at
> > ports 2181, 2182, 2183. Currently we have only one server allocated for
> > Solr. We are planning to move to multiple servers for better sharing,
> > replication etc in near future.
> >
> > Now the issue is that, our organisation has data indexed for different
> > clients as separate collections. We want to uniquely access, update and
> > index each collection separately so that each individual client has
> access
> > to their respective collections at their respective ports. Eg:—
> Collection1
> > at port 8983, Collection2 at port 8984, Collection3 at port 8985 etc.
> >
> > I have two options I guess, one is to run Solr in cloud mode with 4 nodes
> > (max as limited by Solr) at 4 different ports. I don’t know how to go
> > beyond 4 nodes/ports in this case.
> >
> > The other option is to run Solr as service and create multiple copies of
> > Solr folder within the Server folder and access each Solr at different
> port
> > with its own collection as shown by
> > https://www.youtube.com/watch?v=wmQFwK2sujE
> >
> > I am really confused as to which is the better path to choose. Please
> help
> > me out.
> >
> > Thanks.
> >
> > Regards,
> > Salmaan
> >
> >
> > *Thanks and Regards,*
> > Salmaan Rashid Syed
> > +91 8978353445 | www.panna.ai |
> > 5550 Granite Pkwy, Suite #225, Plano TX-75024.
> > Cyber Gateways, Hi-tech City, Hyderabad, Telangana, India.
>
>


Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Jörn Franke
BTW why do you think that SolrCloud is limited to 4 nodes? More are for sure 
possible.

> Am 03.05.2019 um 07:54 schrieb Salmaan Rashid Syed 
> :
> 
> Hi Solr Users,
> 
> I am using Solr 7.6 in cloud mode with external zookeeper installed at
> ports 2181, 2182, 2183. Currently we have only one server allocated for
> Solr. We are planning to move to multiple servers for better sharing,
> replication etc in near future.
> 
> Now the issue is that, our organisation has data indexed for different
> clients as separate collections. We want to uniquely access, update and
> index each collection separately so that each individual client has access
> to their respective collections at their respective ports. Eg:— Collection1
> at port 8983, Collection2 at port 8984, Collection3 at port 8985 etc.
> 
> I have two options I guess, one is to run Solr in cloud mode with 4 nodes
> (max as limited by Solr) at 4 different ports. I don’t know how to go
> beyond 4 nodes/ports in this case.
> 
> The other option is to run Solr as service and create multiple copies of
> Solr folder within the Server folder and access each Solr at different port
> with its own collection as shown by
> https://www.youtube.com/watch?v=wmQFwK2sujE
> 
> I am really confused as to which is the better path to choose. Please help
> me out.
> 
> Thanks.
> 
> Regards,
> Salmaan
> 
> 
> *Thanks and Regards,*
> Salmaan Rashid Syed
> +91 8978353445 | www.panna.ai |
> 5550 Granite Pkwy, Suite #225, Plano TX-75024.
> Cyber Gateways, Hi-tech City, Hyderabad, Telangana, India.


Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Jörn Franke
You can have dedicated clusters per client, and/or you can protect it via 
Kerberos or Basic Auth, or write your own authorization plugin based on OAuth.

I am not sure why you want to offer this on different ports to different 
clients.

> Am 03.05.2019 um 07:54 schrieb Salmaan Rashid Syed 
> :
> 
> Hi Solr Users,
> 
> I am using Solr 7.6 in cloud mode with external zookeeper installed at
> ports 2181, 2182, 2183. Currently we have only one server allocated for
> Solr. We are planning to move to multiple servers for better sharing,
> replication etc in near future.
> 
> Now the issue is that, our organisation has data indexed for different
> clients as separate collections. We want to uniquely access, update and
> index each collection separately so that each individual client has access
> to their respective collections at their respective ports. Eg:— Collection1
> at port 8983, Collection2 at port 8984, Collection3 at port 8985 etc.
> 
> I have two options I guess, one is to run Solr in cloud mode with 4 nodes
> (max as limited by Solr) at 4 different ports. I don’t know how to go
> beyond 4 nodes/ports in this case.
> 
> The other option is to run Solr as service and create multiple copies of
> Solr folder within the Server folder and access each Solr at different port
> with its own collection as shown by
> https://www.youtube.com/watch?v=wmQFwK2sujE
> 
> I am really confused as to which is the better path to choose. Please help
> me out.
> 
> Thanks.
> 
> Regards,
> Salmaan
> 
> 
> *Thanks and Regards,*
> Salmaan Rashid Syed
> +91 8978353445 | www.panna.ai |
> 5550 Granite Pkwy, Suite #225, Plano TX-75024.
> Cyber Gateways, Hi-tech City, Hyderabad, Telangana, India.


Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Walter Underwood
The best option is to run all the collections at the same port. Intra-cluster 
communication cannot be split over multiple ports, so this would require big 
internal changes to Solr. And what about communication that does not belong to 
a collection, like electing an overseer node?

Why do you want the very non-standard configuration?

If you must have it, run a webserver like nginx on each node, configure it to 
do this crazy multiple port thing for external traffic and to forward all 
traffic to Solr’s single port.
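
For what it's worth, a minimal sketch of that nginx idea (the ports and
collection names are made up for illustration): each external port simply
proxies to the one real Solr port.

  server {
      listen 8984;
      location / { proxy_pass http://localhost:8983/solr/collection1/; }
  }
  server {
      listen 8985;
      location / { proxy_pass http://localhost:8983/solr/collection2/; }
  }

But again, I wouldn't recommend it.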

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On May 3, 2019, at 7:54 AM, Salmaan Rashid Syed  
> wrote:
> 
> Hi Solr Users,
> 
> I am using Solr 7.6 in cloud mode with external zookeeper installed at
> ports 2181, 2182, 2183. Currently we have only one server allocated for
> Solr. We are planning to move to multiple servers for better sharing,
> replication etc in near future.
> 
> Now the issue is that, our organisation has data indexed for different
> clients as separate collections. We want to uniquely access, update and
> index each collection separately so that each individual client has access
> to their respective collections at their respective ports. Eg:— Collection1
> at port 8983, Collection2 at port 8984, Collection3 at port 8985 etc.
> 
> I have two options I guess, one is to run Solr in cloud mode with 4 nodes
> (max as limited by Solr) at 4 different ports. I don’t know how to go
> beyond 4 nodes/ports in this case.
> 
> The other option is to run Solr as service and create multiple copies of
> Solr folder within the Server folder and access each Solr at different port
> with its own collection as shown by
> https://www.youtube.com/watch?v=wmQFwK2sujE
> 
> I am really confused as to which is the better path to choose. Please help
> me out.
> 
> Thanks.
> 
> Regards,
> Salmaan
> 
> 
> *Thanks and Regards,*
> Salmaan Rashid Syed
> +91 8978353445 | www.panna.ai |
> 5550 Granite Pkwy, Suite #225, Plano TX-75024.
> Cyber Gateways, Hi-tech City, Hyderabad, Telangana, India.



Accessing Solr collections at different ports - Need help

2019-05-02 Thread Salmaan Rashid Syed
Hi Solr Users,

I am using Solr 7.6 in cloud mode with external zookeeper installed at
ports 2181, 2182, 2183. Currently we have only one server allocated for
Solr. We are planning to move to multiple servers for better sharding,
replication etc in near future.

Now the issue is that, our organisation has data indexed for different
clients as separate collections. We want to uniquely access, update and
index each collection separately so that each individual client has access
to their respective collections at their respective ports. Eg:— Collection1
at port 8983, Collection2 at port 8984, Collection3 at port 8985 etc.

I have two options I guess, one is to run Solr in cloud mode with 4 nodes
(max as limited by Solr) at 4 different ports. I don’t know how to go
beyond 4 nodes/ports in this case.

The other option is to run Solr as service and create multiple copies of
Solr folder within the Server folder and access each Solr at different port
with its own collection as shown by
https://www.youtube.com/watch?v=wmQFwK2sujE

I am really confused as to which is the better path to choose. Please help
me out.

Thanks.

Regards,
Salmaan


*Thanks and Regards,*
Salmaan Rashid Syed
+91 8978353445 | www.panna.ai |
5550 Granite Pkwy, Suite #225, Plano TX-75024.
Cyber Gateways, Hi-tech City, Hyderabad, Telangana, India.


Re: Autosuggest help

2019-04-06 Thread Midas A
Any update?

On Thu, 4 Apr 2019, 1:09 pm Midas A,  wrote:

> Hi,
>
> We need to use auto suggest click stream data in Auto suggestion . How we
> can achieve this ?
>
> Currently we are using suggester for auto suggestions .
>
>
> Regards,
> Midas
>


Autosuggest help

2019-04-04 Thread Midas A
Hi,

We need to use click-stream data in our auto-suggestions. How can we achieve
this?

Currently we are using the Suggester for auto-suggestions.


Regards,
Midas


Re: Help with slow retrieving data

2019-03-26 Thread Wendy2
Hi Eric,

Thank you for your response!  

On the old system, I changed to docValues=true and got better performance.
But the searcher was not warmed before I measured it. Also, the local disk
was too small, so I used an attached volume, which turned out to be a big
cause of the slow retrieval.

On the new system, I didn't use docValues=true, but I used an SSD, so
retrieval was much, much faster.

In both cases, the QTime were good. 

I will keep tuning the performance for sorting, facets, etc. 

Thanks and all the best! 



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Help with slow retrieving data

2019-03-25 Thread Erick Erickson
Glad it’s working out for you. There are a couple of things here that bear a 
bit more investigation.

Using SSDs shouldn’t materially affect the response if:

1> the searcher is warmed. Before trying your query, execute a few queries like 
“q="some search that hits a lot of docs"&sort=myfield asc”

2> Your Solr instance isn't swapping.

What’s not making sense is that once docValues are read into memory, there is 
_no_ disk access necessary, assuming the DV structure for the field has not 
been swapped out.

Things that may be getting in the way:

- you are asking for _any_ fields to be returned that are not docValues

- you are not getting the docValues (useDocValuesAsStored=true)

- your Solr instance is swapping. DocValues data is kept in the OS memory 
space, see: 
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

- you haven’t warmed up your searchers to read these values off disk before you 
measure.
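
On that last point, a minimal warming entry in solrconfig.xml might look like
the following (the query values are just placeholders based on this thread):

  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">human</str><str name="sort">pdb_id asc</str></lst>
    </arr>
  </listener>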

Your results are in line with expectations, but can’t account for the 
difference between your old system and new. Perhaps when you re-indexed you had 
some docValues that weren’t before?

FWIW,
Erick

> On Mar 25, 2019, at 10:44 AM, Wendy2  wrote:
> 
> Hi Eric,
> 
> Thank you very much for your response! I tried 
> 
> "Try this: 
> 1> insure docValues=true for the field. You’ll have to re-index all your
> docs. "
> 
> I tried the above approach as you recommended, the performance was getting
> better, reduced about 3 seconds.
> 
> Then I tested on a new cloud server with local SSD for one core on Solr, the
> performance was great.
> With 5 rows to retrieve, the response time was 0.2s, which is better
> than our acceptance criteria :-)
> So happy.  Thank you!
> 
> =testing
> wget -O output.txt
> 'http://localhost:8983/solr/s_entry/select?fl=pdb_id,score=human=0=5'
> --2019-03-25 10:23:21-- 
> http://localhost:8983/solr/s_entry/select?fl=pdb_id,score=human=0=5
> Resolving localhost (localhost)... ::1, 127.0.0.1
> Connecting to localhost (localhost)|::1|:8983... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: unspecified [application/json]
> Saving to: 'output.txt'
> 
> output.txt [
> <=>   
>   
>  
> ]   2.90M  16.1MB/sin 0.2s
> 
> 
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html



Re: Help with slow retrieving data

2019-03-25 Thread Wendy2
Hi Eric,

Thank you very much for your response! I tried 

"Try this: 
1> insure docValues=true for the field. You’ll have to re-index all your
docs. "

I tried the above approach as you recommended; the performance got better,
with the response time reduced by about 3 seconds.

Then I tested on a new cloud server with local SSD for one core on Solr, the
performance was great.
With 5 rows to retrieve, the response time was 0.2s, which is better
than our acceptance criteria :-)
So happy.  Thank you!

=testing
wget -O output.txt
'http://localhost:8983/solr/s_entry/select?fl=pdb_id,score&q=human&start=0&rows=5'
--2019-03-25 10:23:21--
http://localhost:8983/solr/s_entry/select?fl=pdb_id,score&q=human&start=0&rows=5
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:8983... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/json]
Saving to: 'output.txt'

output.txt  [ <=> ]  2.90M  16.1MB/s  in 0.2s





--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Help with slow retrieving data

2019-03-25 Thread Wendy2
Hi Eric,

Thank you very much for your response!

"Try this: 
1> insure docValues=true for the field. You’ll have to re-index all your
docs."

I tried using docValues and it reduced the time by about 3 seconds. Now I am
going to try "2> if that doesn’t make much of a difference, try adding
useDocValuesAsStored for the field." and will report back. Thanks!
 



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Help with slow retrieving data

2019-03-24 Thread Erick Erickson
If the fields you’re returning have to be pulled from the store=“true” parts of 
the index, then each value returned requires
1> a disk read
2> decompressing 16K minimum

which is what Shawn was getting at.

Try this:
1> insure docValues=true for the field. You’ll have to re-index all your docs.
2> if that doesn’t make much of a difference, try adding useDocValuesAsStored 
for the field.
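
For reference, a field set up that way in the schema might look roughly like
this (the field name is taken from your queries; the type and other
attributes are assumptions):

  <field name="pdb_id" type="string" indexed="true" stored="true"
         docValues="true" useDocValuesAsStored="true"/>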

You’ll have to make sure your index is warmed up to get good measurements.

And be aware that if you’re displaying this in a browser, the browser itself 
may be taking a long time to render the results. To eliminate that, try sending 
the query with curl or similar.

Finally, Solr was not designed to be efficient at returning large numbers of 
rows. You may well want to use streaming for that: 
https://lucene.apache.org/solr/guide/7_2/streaming-expressions.html
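
As a rough sketch, for the query in this thread that might look something like
the following (assuming pdb_id has docValues, since /export only returns
docValues fields):

  curl 'http://localhost:8983/solr/s_entry/stream' \
    --data-urlencode 'expr=search(s_entry, q="human", fl="pdb_id", sort="pdb_id asc", qt="/export")'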

Best,
Erick

> On Mar 24, 2019, at 3:30 PM, Wendy2  wrote:
> 
> Hi Shawn,Thanks for your response.  I have several Solr cores on the same
> Solr instance. The particular core with slow retrieve response has 6 gb
> data. Sorry for the confusion.I restart Solr and ran same query with rows=0
> vs 1, QTime for both are OK, so I guess it is the retrieving slow? I
> also tried return different rows, the more rows, the longer retrieving time.
> The machine has 64G ram, I tried 32G for Solr Heap memory, but the
> performance didn't improve much.  Any suggestions?  Thank you very
> much!=Return 0 rows:232  --.-KB/sin 0s { 
> "responseHeader":{"status":0,"QTime":96,"params":{ 
> "q":"human",  "fl":"pdb_id,score",  "start":"0",  "rows":"0"}}, 
> "response":{"numFound":67428,"start":0,"maxScore":246.08528,"docs":[]  }}~
>   
> Return 1 rows: 584.46K  65.4KB/sin 8.9s {  "responseHeader":{   
> "status":0,"QTime":39,"params":{  "q":"human", 
> "fl":"pdb_id,score",  "start":"0",  "rows":"1"}}, 
> "response":{"numFound":67428,"start":0,"maxScore":246.08528,"docs":[  {
>  
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html



Re: Help with slow retrieving data

2019-03-24 Thread Wendy2
Hi Shawn,

Thanks for your response. I have several Solr cores on the same Solr
instance; the particular core with the slow retrieval has 6 GB of data.
Sorry for the confusion. I restarted Solr and ran the same query with
rows=0 vs 1; QTime for both is OK, so I guess it is the retrieval that is
slow? I also tried returning different numbers of rows: the more rows, the
longer the retrieval time. The machine has 64 GB of RAM and I tried 32 GB
for the Solr heap, but the performance didn't improve much. Any suggestions?
Thank you very much!

=Return 0 rows:  232  --.-KB/s  in 0s
{
  "responseHeader":{"status":0,"QTime":96,"params":{
    "q":"human", "fl":"pdb_id,score", "start":"0", "rows":"0"}},
  "response":{"numFound":67428,"start":0,"maxScore":246.08528,"docs":[] }}

Return 1 rows:  584.46K  65.4KB/s  in 8.9s
{
  "responseHeader":{"status":0,"QTime":39,"params":{
    "q":"human", "fl":"pdb_id,score", "start":"0", "rows":"1"}},
  "response":{"numFound":67428,"start":0,"maxScore":246.08528,"docs":[ {
 



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Help with slow retrieving data

2019-03-24 Thread Shawn Heisey

On 3/24/2019 12:11 PM, Wendy2 wrote:

Thank you very much for your response! Here is a screen shot.  Is the CPU an
issue?


You said that your index is 6GB, but the process listing is saying that 
you have more than 30GB of index data being managed by Solr.  There's a 
discrepancy somewhere.


This listing appears to be sorted by CPU, not by resident memory as the 
instructions I pointed you at indicated.  I can't be sure whether or not 
something important is missing from the listing.  For now I am going to 
assume that I can see everything important.


What happens if you restart Solr or reload your core and then do the 
same query with rows=0?  Is that fast or slow?  If it is slow, then it 
is not retrieving the data that is slow, but the query itself.


Retrieving a large number of rows normally involves decompressing stored 
fields.  This will exercise the CPU.


It looks like you have a 4GB heap for Solr.  With over 30GB of index, 
it's entirely possible that 4GB of heap is not enough ... or it might be 
plenty.  It's not super easy to figure out exactly how much heap you 
need.  Usually it requires experimentation.


You are sharing this machine between Solr and mongodb.  Depending on how 
much data is in the mongo database, you might need to add more memory or 
split your services onto different machines.


Thanks,
Shawn


Re: Help with slow retrieving data

2019-03-24 Thread Wendy2
Hi Shawn,

Thank you very much for your response! Here is a screen shot.  Is the CPU an
issue?


 



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Help with slow retrieving data

2019-03-24 Thread Shawn Heisey

On 3/24/2019 7:16 AM, Wendy2 wrote:

Hi Solr users:I use Solr 7.3.1 and 150,000 documents and about 6GB in total.
When I try to retrieve 2 ids (4 letter code, indexed and stored), it
took 17s to retrieve 1.14M size data. I tried to increase RAM and cache, but


Can you get the screenshot described here, share it with a file sharing 
site, and provide the link?


https://wiki.apache.org/solr/SolrPerformanceProblems#Asking_for_help_on_a_memory.2Fperformance_issue

Thanks,
Shawn



Help with slow retrieving data

2019-03-24 Thread Wendy2
Hi Solr users,

I use Solr 7.3.1 with 150,000 documents, about 6 GB in total. When I try to
retrieve 2 ids (a 4-letter code, indexed and stored), it takes 17s to
retrieve 1.14M of data. I tried increasing RAM and cache, but it only helped
to some degree (from 25s to 17s). Any ideas/suggestions on where I should
look? Thanks!

wget -O output.txt
'http://localhost:8983/solr/s_entry/search?fl=pdb_id,score&q=human&start=0&rows=2'

1.14M  66.7KB/s  in 17s



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Need help on LTR

2019-03-22 Thread Kamuela Lau
I think the issue is that you store the feature as  originalScore but in
your model you refer to it as original_score
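
i.e. with the names matched up, the model file from earlier in the thread
would look roughly like this (only the feature name is changed; whether the
weights make sense is a separate question):

  {
    "store": "exampleFeatureStore",
    "class": "org.apache.solr.ltr.model.LinearModel",
    "name": "exampleModelStore",
    "features": [
      { "name": "isCityName" },
      { "name": "isLat" },
      { "name": "originalScore" }
    ],
    "params": {
      "weights": {
        "isCityName": 0.0,
        "isLat": 0.0,
        "originalScore": 1.0
      }
    }
  }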

On Wed, Mar 20, 2019 at 1:58 PM Mohomed Rimash  wrote:

> one more thing i noticed is your feature params values doesn't wrap in q or
> qf field. check that as well
>
> On Wed, 20 Mar 2019 at 01:34, Amjad Khan  wrote:
>
> > Did, but same error
> >
> > {
> >   "responseHeader":{
> > "status":400,
> > "QTime":5},
> >   "error":{
> > "metadata":[
> >   "error-class","org.apache.solr.common.SolrException",
> >   "root-error-class","java.lang.NullPointerException"],
> > "msg":"org.apache.solr.ltr.model.ModelException: Model type does not
> > exist org.apache.solr.ltr.model.LinearModel",
> > "code":400}}
> >
> >
> >
> > > On Mar 19, 2019, at 3:26 PM, Mohomed Rimash 
> > wrote:
> > >
> > > Please update the weights values to greater than 0 and less than 1.
> > >
> > > On Wed, 20 Mar 2019 at 00:13, Amjad Khan  wrote:
> > >
> > >> Feature File
> > >> ===
> > >>
> > >> [
> > >>  {
> > >>"store" : "exampleFeatureStore",
> > >>"name" : "isCityName",
> > >>"class" : "org.apache.solr.ltr.feature.FieldValueFeature",
> > >>"params" : { "field" : "CITY_NAME" }
> > >>  },
> > >>  {
> > >>"store" : "exampleFeatureStore",
> > >>"name" : "originalScore",
> > >>"class" : "org.apache.solr.ltr.feature.OriginalScoreFeature",
> > >>"params" : {}
> > >>  },
> > >>  {
> > >>"store" : "exampleFeatureStore",
> > >>"name" : "isLat",
> > >>"class" : "org.apache.solr.ltr.feature.FieldValueFeature",
> > >>"params" : { "field" : "LATITUDE" }
> > >>  }
> > >> ]
> > >>
> > >> Model File
> > >> ==
> > >> {
> > >>  "store": "exampleFeatureStore",
> > >>  "class": "org.apache.solr.ltr.model.LinearModel",
> > >>  "name": "exampleModelStore",
> > >>  "features": [{
> > >>  "name": "isCityName"
> > >>},
> > >>{
> > >>  "name": "isLat"
> > >>},
> > >>{
> > >>  "name": "original_score"
> > >>}
> > >>  ],
> > >>  "params": {
> > >>"weights": {
> > >>  "isCityName": 0.0,
> > >>  "isLat": 0.0,
> > >>  "original_score": 1.0
> > >>}
> > >>  }
> > >> }
> > >>
> > >>
> > >>
> > >>> On Mar 19, 2019, at 2:04 PM, Mohomed Rimash 
> > >> wrote:
> > >>>
> > >>> Can you share the feature file and the model file,
> > >>> 1. I had few instances where invalid values for parameters (ie
> weights
> > >> set
> > >>> to more than 1 , with minmaxnormalizer) resulted the above error,
> > >>> 2, Check all the features added to the model has a weight under
> params
> > ->
> > >>> weights in the model
> > >>>
> > >>>
> > >>> On Tue, 19 Mar 2019 at 21:21, Roopa Rao  wrote:
> > >>>
> >  Does your feature definitions and the feature names used in the
> model
> >  match?
> > 
> >  On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan 
> > >> wrote:
> > 
> > > Yes, I did.
> > >
> > > I can see the feature that I created by this
> > > schema/feature-store/exampleFeatureStore and it return me the
> > features
> > >> I
> > > created. But issue is when I try to put store-model.
> > >
> > >> On Mar 19, 2019, at 12:18 AM, Mohomed Rimash <
> rim...@yaalalabs.com>
> > > wrote:
> > >>
> > >> Hi Amjad, After adding the libraries into the path, Did you
> restart
> > >> the
> > >> SOLR ?
> > >>
> > >> On Tue, 19 Mar 2019 at 08:45, Amjad Khan 
> > wrote:
> > >>
> > >>> I followed the Solr LTR Documentation
> > >>>
> > >>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
> > >>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
> > >>>
> > >>> 1. Added library into the solr-config
> > >>> 
> > >>>  > >>> regex=".*\.jar" />
> > >>>  > >>> regex="solr-ltr-\d.*\.jar" />
> > >>> 2. Successfully added feature
> > >>> 3. Get schema to see feature is available
> > >>> 4. When I try to push model I see the error below, however I
> added
> > >> the
> > > lib
> > >>> into solr-cofig
> > >>>
> > >>> Response
> > >>> {
> > >>> "responseHeader":{
> > >>>  "status":400,
> > >>>  "QTime":1},
> > >>> "error":{
> > >>>  "metadata":[
> > >>>"error-class","org.apache.solr.common.SolrException",
> > >>>"root-error-class","java.lang.NullPointerException"],
> > >>>  "msg":"org.apache.solr.ltr.model.ModelException: Model type does
> >  not
> > >>> exist org.apache.solr.ltr.model.LinearModel",
> > >>>  "code":400}}
> > >>>
> > >>> Thanks
> > >
> > >
> > 
> > >>
> > >>
> >
> >
>


Re: Need help on LTR

2019-03-19 Thread Mohomed Rimash
One more thing I noticed: your feature params values aren't wrapped in a q or
qf field. Check that as well.

On Wed, 20 Mar 2019 at 01:34, Amjad Khan  wrote:

> Did, but same error
>
> {
>   "responseHeader":{
> "status":400,
> "QTime":5},
>   "error":{
> "metadata":[
>   "error-class","org.apache.solr.common.SolrException",
>   "root-error-class","java.lang.NullPointerException"],
> "msg":"org.apache.solr.ltr.model.ModelException: Model type does not
> exist org.apache.solr.ltr.model.LinearModel",
> "code":400}}
>
>
>
> > On Mar 19, 2019, at 3:26 PM, Mohomed Rimash 
> wrote:
> >
> > Please update the weights values to greater than 0 and less than 1.
> >
> > On Wed, 20 Mar 2019 at 00:13, Amjad Khan  wrote:
> >
> >> Feature File
> >> ===
> >>
> >> [
> >>  {
> >>"store" : "exampleFeatureStore",
> >>"name" : "isCityName",
> >>"class" : "org.apache.solr.ltr.feature.FieldValueFeature",
> >>"params" : { "field" : "CITY_NAME" }
> >>  },
> >>  {
> >>"store" : "exampleFeatureStore",
> >>"name" : "originalScore",
> >>"class" : "org.apache.solr.ltr.feature.OriginalScoreFeature",
> >>"params" : {}
> >>  },
> >>  {
> >>"store" : "exampleFeatureStore",
> >>"name" : "isLat",
> >>"class" : "org.apache.solr.ltr.feature.FieldValueFeature",
> >>"params" : { "field" : "LATITUDE" }
> >>  }
> >> ]
> >>
> >> Model File
> >> ==
> >> {
> >>  "store": "exampleFeatureStore",
> >>  "class": "org.apache.solr.ltr.model.LinearModel",
> >>  "name": "exampleModelStore",
> >>  "features": [{
> >>  "name": "isCityName"
> >>},
> >>{
> >>  "name": "isLat"
> >>},
> >>{
> >>  "name": "original_score"
> >>}
> >>  ],
> >>  "params": {
> >>"weights": {
> >>  "isCityName": 0.0,
> >>  "isLat": 0.0,
> >>  "original_score": 1.0
> >>}
> >>  }
> >> }
> >>
> >>
> >>
> >>> On Mar 19, 2019, at 2:04 PM, Mohomed Rimash 
> >> wrote:
> >>>
> >>> Can you share the feature file and the model file,
> >>> 1. I had few instances where invalid values for parameters (ie weights
> >> set
> >>> to more than 1 , with minmaxnormalizer) resulted the above error,
> >>> 2, Check all the features added to the model has a weight under params
> ->
> >>> weights in the model
> >>>
> >>>
> >>> On Tue, 19 Mar 2019 at 21:21, Roopa Rao  wrote:
> >>>
>  Does your feature definitions and the feature names used in the model
>  match?
> 
>  On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan 
> >> wrote:
> 
> > Yes, I did.
> >
> > I can see the feature that I created by this
> > schema/feature-store/exampleFeatureStore and it return me the
> features
> >> I
> > created. But issue is when I try to put store-model.
> >
> >> On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
> > wrote:
> >>
> >> Hi Amjad, After adding the libraries into the path, Did you restart
> >> the
> >> SOLR ?
> >>
> >> On Tue, 19 Mar 2019 at 08:45, Amjad Khan 
> wrote:
> >>
> >>> I followed the Solr LTR Documentation
> >>>
> >>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
> >>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
> >>>
> >>> 1. Added library into the solr-config
> >>> 
> >>>  >>> regex=".*\.jar" />
> >>>  >>> regex="solr-ltr-\d.*\.jar" />
> >>> 2. Successfully added feature
> >>> 3. Get schema to see feature is available
> >>> 4. When I try to push model I see the error below, however I added
> >> the
> > lib
> >>> into solr-cofig
> >>>
> >>> Response
> >>> {
> >>> "responseHeader":{
> >>>  "status":400,
> >>>  "QTime":1},
> >>> "error":{
> >>>  "metadata":[
> >>>"error-class","org.apache.solr.common.SolrException",
> >>>"root-error-class","java.lang.NullPointerException"],
> >>>  "msg":"org.apache.solr.ltr.model.ModelException: Model type does
>  not
> >>> exist org.apache.solr.ltr.model.LinearModel",
> >>>  "code":400}}
> >>>
> >>> Thanks
> >
> >
> 
> >>
> >>
>
>


Re: Need help on LTR

2019-03-19 Thread Roopa ML
In the model file, replace original_score with originalScore.

Roopa

Sent from my iPhone

> On Mar 19, 2019, at 2:44 PM, Amjad Khan  wrote:
> 
> Roopa,
> 
> Yes
> 
>> On Mar 19, 2019, at 11:51 AM, Roopa Rao  wrote:
>> 
>> Does your feature definitions and the feature names used in the model match?
>> 
>>> On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan  wrote:
>>> 
>>> Yes, I did.
>>> 
>>> I can see the feature that I created by this
>>> schema/feature-store/exampleFeatureStore and it return me the features I
>>> created. But issue is when I try to put store-model.
>>> 
 On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
>>> wrote:
 
 Hi Amjad, After adding the libraries into the path, Did you restart the
 SOLR ?
 
> On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
> 
> I followed the Solr LTR Documentation
> 
> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
> 
> 1. Added library into the solr-config
> 
>  regex=".*\.jar" />
>  regex="solr-ltr-\d.*\.jar" />
> 2. Successfully added feature
> 3. Get schema to see feature is available
> 4. When I try to push model I see the error below, however I added the
>>> lib
> into solr-cofig
> 
> Response
> {
> "responseHeader":{
>  "status":400,
>  "QTime":1},
> "error":{
>  "metadata":[
>"error-class","org.apache.solr.common.SolrException",
>"root-error-class","java.lang.NullPointerException"],
>  "msg":"org.apache.solr.ltr.model.ModelException: Model type does not
> exist org.apache.solr.ltr.model.LinearModel",
>  "code":400}}
> 
> Thanks
>>> 
>>> 
> 


Re: Need help on LTR

2019-03-19 Thread Amjad Khan
Did, but same error

{
  "responseHeader":{
"status":400,
"QTime":5},
  "error":{
"metadata":[
  "error-class","org.apache.solr.common.SolrException",
  "root-error-class","java.lang.NullPointerException"],
"msg":"org.apache.solr.ltr.model.ModelException: Model type does not exist 
org.apache.solr.ltr.model.LinearModel",
"code":400}}



> On Mar 19, 2019, at 3:26 PM, Mohomed Rimash  wrote:
> 
> Please update the weights values to greater than 0 and less than 1.
> 
> On Wed, 20 Mar 2019 at 00:13, Amjad Khan  wrote:
> 
>> Feature File
>> ===
>> 
>> [
>>  {
>>"store" : "exampleFeatureStore",
>>"name" : "isCityName",
>>"class" : "org.apache.solr.ltr.feature.FieldValueFeature",
>>"params" : { "field" : "CITY_NAME" }
>>  },
>>  {
>>"store" : "exampleFeatureStore",
>>"name" : "originalScore",
>>"class" : "org.apache.solr.ltr.feature.OriginalScoreFeature",
>>"params" : {}
>>  },
>>  {
>>"store" : "exampleFeatureStore",
>>"name" : "isLat",
>>"class" : "org.apache.solr.ltr.feature.FieldValueFeature",
>>"params" : { "field" : "LATITUDE" }
>>  }
>> ]
>> 
>> Model File
>> ==
>> {
>>  "store": "exampleFeatureStore",
>>  "class": "org.apache.solr.ltr.model.LinearModel",
>>  "name": "exampleModelStore",
>>  "features": [{
>>  "name": "isCityName"
>>},
>>{
>>  "name": "isLat"
>>},
>>{
>>  "name": "original_score"
>>}
>>  ],
>>  "params": {
>>"weights": {
>>  "isCityName": 0.0,
>>  "isLat": 0.0,
>>  "original_score": 1.0
>>}
>>  }
>> }
>> 
>> 
>> 
>>> On Mar 19, 2019, at 2:04 PM, Mohomed Rimash 
>> wrote:
>>> 
>>> Can you share the feature file and the model file,
>>> 1. I had few instances where invalid values for parameters (ie weights
>> set
>>> to more than 1 , with minmaxnormalizer) resulted the above error,
>>> 2, Check all the features added to the model has a weight under params ->
>>> weights in the model
>>> 
>>> 
>>> On Tue, 19 Mar 2019 at 21:21, Roopa Rao  wrote:
>>> 
 Does your feature definitions and the feature names used in the model
 match?
 
 On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan 
>> wrote:
 
> Yes, I did.
> 
> I can see the feature that I created by this
> schema/feature-store/exampleFeatureStore and it return me the features
>> I
> created. But issue is when I try to put store-model.
> 
>> On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
> wrote:
>> 
>> Hi Amjad, After adding the libraries into the path, Did you restart
>> the
>> SOLR ?
>> 
>> On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
>> 
>>> I followed the Solr LTR Documentation
>>> 
>>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
>>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
>>> 
>>> 1. Added library into the solr-config
>>> 
>>> >> regex=".*\.jar" />
>>> >> regex="solr-ltr-\d.*\.jar" />
>>> 2. Successfully added feature
>>> 3. Get schema to see feature is available
>>> 4. When I try to push model I see the error below, however I added
>> the
> lib
>>> into solr-cofig
>>> 
>>> Response
>>> {
>>> "responseHeader":{
>>>  "status":400,
>>>  "QTime":1},
>>> "error":{
>>>  "metadata":[
>>>"error-class","org.apache.solr.common.SolrException",
>>>"root-error-class","java.lang.NullPointerException"],
>>>  "msg":"org.apache.solr.ltr.model.ModelException: Model type does
 not
>>> exist org.apache.solr.ltr.model.LinearModel",
>>>  "code":400}}
>>> 
>>> Thanks
> 
> 
 
>> 
>> 



Re: Need help on LTR

2019-03-19 Thread Mohomed Rimash
Please update the weights values to greater than 0 and less than 1.

On Wed, 20 Mar 2019 at 00:13, Amjad Khan  wrote:

> Feature File
> ===
>
> [
>   {
> "store" : "exampleFeatureStore",
> "name" : "isCityName",
> "class" : "org.apache.solr.ltr.feature.FieldValueFeature",
> "params" : { "field" : "CITY_NAME" }
>   },
>   {
> "store" : "exampleFeatureStore",
> "name" : "originalScore",
> "class" : "org.apache.solr.ltr.feature.OriginalScoreFeature",
> "params" : {}
>   },
>   {
> "store" : "exampleFeatureStore",
> "name" : "isLat",
> "class" : "org.apache.solr.ltr.feature.FieldValueFeature",
> "params" : { "field" : "LATITUDE" }
>   }
> ]
>
> Model File
> ==
> {
>   "store": "exampleFeatureStore",
>   "class": "org.apache.solr.ltr.model.LinearModel",
>   "name": "exampleModelStore",
>   "features": [{
>   "name": "isCityName"
> },
> {
>   "name": "isLat"
> },
> {
>   "name": "original_score"
> }
>   ],
>   "params": {
> "weights": {
>   "isCityName": 0.0,
>   "isLat": 0.0,
>   "original_score": 1.0
> }
>   }
> }
>
>
>
> > On Mar 19, 2019, at 2:04 PM, Mohomed Rimash 
> wrote:
> >
> > Can you share the feature file and the model file,
> > 1. I had few instances where invalid values for parameters (ie weights
> set
> > to more than 1 , with minmaxnormalizer) resulted the above error,
> > 2, Check all the features added to the model has a weight under params ->
> > weights in the model
> >
> >
> > On Tue, 19 Mar 2019 at 21:21, Roopa Rao  wrote:
> >
> >> Does your feature definitions and the feature names used in the model
> >> match?
> >>
> >> On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan 
> wrote:
> >>
> >>> Yes, I did.
> >>>
> >>> I can see the feature that I created by this
> >>> schema/feature-store/exampleFeatureStore and it return me the features
> I
> >>> created. But issue is when I try to put store-model.
> >>>
>  On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
> >>> wrote:
> 
>  Hi Amjad, After adding the libraries into the path, Did you restart
> the
>  SOLR ?
> 
>  On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
> 
> > I followed the Solr LTR Documentation
> >
> > https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
> > https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
> >
> > 1. Added library into the solr-config
> > 
> >  > regex=".*\.jar" />
> >  > regex="solr-ltr-\d.*\.jar" />
> > 2. Successfully added feature
> > 3. Get schema to see feature is available
> > 4. When I try to push model I see the error below, however I added
> the
> >>> lib
> > into solr-cofig
> >
> > Response
> > {
> > "responseHeader":{
> >   "status":400,
> >   "QTime":1},
> > "error":{
> >   "metadata":[
> > "error-class","org.apache.solr.common.SolrException",
> > "root-error-class","java.lang.NullPointerException"],
> >   "msg":"org.apache.solr.ltr.model.ModelException: Model type does
> >> not
> > exist org.apache.solr.ltr.model.LinearModel",
> >   "code":400}}
> >
> > Thanks
> >>>
> >>>
> >>
>
>


Re: Need help on LTR

2019-03-19 Thread Amjad Khan
Roopa,

Yes

> On Mar 19, 2019, at 11:51 AM, Roopa Rao  wrote:
> 
> Does your feature definitions and the feature names used in the model match?
> 
> On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan  wrote:
> 
>> Yes, I did.
>> 
>> I can see the feature that I created by this
>> schema/feature-store/exampleFeatureStore and it return me the features I
>> created. But issue is when I try to put store-model.
>> 
>>> On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
>> wrote:
>>> 
>>> Hi Amjad, After adding the libraries into the path, Did you restart the
>>> SOLR ?
>>> 
>>> On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
>>> 
 I followed the Solr LTR Documentation
 
 https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
 https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
 
 1. Added library into the solr-config
 
 >>> regex=".*\.jar" />
 >>> regex="solr-ltr-\d.*\.jar" />
 2. Successfully added feature
 3. Get schema to see feature is available
 4. When I try to push model I see the error below, however I added the
>> lib
 into solr-cofig
 
 Response
 {
 "responseHeader":{
   "status":400,
   "QTime":1},
 "error":{
   "metadata":[
 "error-class","org.apache.solr.common.SolrException",
 "root-error-class","java.lang.NullPointerException"],
   "msg":"org.apache.solr.ltr.model.ModelException: Model type does not
 exist org.apache.solr.ltr.model.LinearModel",
   "code":400}}
 
 Thanks
>> 
>> 



Re: Need help on LTR

2019-03-19 Thread Amjad Khan
Feature File
===

[
  {
"store" : "exampleFeatureStore",
"name" : "isCityName",
"class" : "org.apache.solr.ltr.feature.FieldValueFeature",
"params" : { "field" : "CITY_NAME" }
  },
  {
"store" : "exampleFeatureStore",
"name" : "originalScore",
"class" : "org.apache.solr.ltr.feature.OriginalScoreFeature",
"params" : {}
  },
  {
"store" : "exampleFeatureStore",
"name" : "isLat",
"class" : "org.apache.solr.ltr.feature.FieldValueFeature",
"params" : { "field" : "LATITUDE" }
  }
]

Model File
==
{
  "store": "exampleFeatureStore",
  "class": "org.apache.solr.ltr.model.LinearModel",
  "name": "exampleModelStore",
  "features": [{
  "name": "isCityName"
},
{
  "name": "isLat"
},
{
  "name": "original_score"
}
  ],
  "params": {
"weights": {
  "isCityName": 0.0,
  "isLat": 0.0,
  "original_score": 1.0
}
  }
}



> On Mar 19, 2019, at 2:04 PM, Mohomed Rimash  wrote:
> 
> Can you share the feature file and the model file,
> 1. I had few instances where invalid values for parameters (ie weights set
> to more than 1 , with minmaxnormalizer) resulted the above error,
> 2, Check all the features added to the model has a weight under params ->
> weights in the model
> 
> 
> On Tue, 19 Mar 2019 at 21:21, Roopa Rao  wrote:
> 
>> Does your feature definitions and the feature names used in the model
>> match?
>> 
>> On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan  wrote:
>> 
>>> Yes, I did.
>>> 
>>> I can see the feature that I created by this
>>> schema/feature-store/exampleFeatureStore and it return me the features I
>>> created. But issue is when I try to put store-model.
>>> 
 On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
>>> wrote:
 
 Hi Amjad, After adding the libraries into the path, Did you restart the
 SOLR ?
 
 On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
 
> I followed the Solr LTR Documentation
> 
> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
> 
> 1. Added library into the solr-config
> 
>  regex=".*\.jar" />
>  regex="solr-ltr-\d.*\.jar" />
> 2. Successfully added feature
> 3. Get schema to see feature is available
> 4. When I try to push model I see the error below, however I added the
>>> lib
> into solr-cofig
> 
> Response
> {
> "responseHeader":{
>   "status":400,
>   "QTime":1},
> "error":{
>   "metadata":[
> "error-class","org.apache.solr.common.SolrException",
> "root-error-class","java.lang.NullPointerException"],
>   "msg":"org.apache.solr.ltr.model.ModelException: Model type does
>> not
> exist org.apache.solr.ltr.model.LinearModel",
>   "code":400}}
> 
> Thanks
>>> 
>>> 
>> 



Re: Need help on LTR

2019-03-19 Thread Mohomed Rimash
Can you share the feature file and the model file?
1. I had a few instances where invalid values for parameters (i.e. weights set
to more than 1, with MinMaxNormalizer) resulted in the above error.
2. Check that all the features added to the model have a weight under params ->
weights in the model.


On Tue, 19 Mar 2019 at 21:21, Roopa Rao  wrote:

> Does your feature definitions and the feature names used in the model
> match?
>
> On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan  wrote:
>
> > Yes, I did.
> >
> > I can see the feature that I created by this
> > schema/feature-store/exampleFeatureStore and it return me the features I
> > created. But issue is when I try to put store-model.
> >
> > > On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
> > wrote:
> > >
> > > Hi Amjad, After adding the libraries into the path, Did you restart the
> > > SOLR ?
> > >
> > > On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
> > >
> > >> I followed the Solr LTR Documentation
> > >>
> > >> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
> > >> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
> > >>
> > >> 1. Added library into the solr-config
> > >> 
> > >>   > >> regex=".*\.jar" />
> > >>  > >> regex="solr-ltr-\d.*\.jar" />
> > >> 2. Successfully added feature
> > >> 3. Get schema to see feature is available
> > >> 4. When I try to push model I see the error below, however I added the
> > lib
> > >> into solr-cofig
> > >>
> > >> Response
> > >> {
> > >>  "responseHeader":{
> > >>"status":400,
> > >>"QTime":1},
> > >>  "error":{
> > >>"metadata":[
> > >>  "error-class","org.apache.solr.common.SolrException",
> > >>  "root-error-class","java.lang.NullPointerException"],
> > >>"msg":"org.apache.solr.ltr.model.ModelException: Model type does
> not
> > >> exist org.apache.solr.ltr.model.LinearModel",
> > >>"code":400}}
> > >>
> > >> Thanks
> >
> >
>


Re: Need help on LTR

2019-03-19 Thread Roopa Rao
Does your feature definitions and the feature names used in the model match?

On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan  wrote:

> Yes, I did.
>
> I can see the feature that I created by this
> schema/feature-store/exampleFeatureStore and it return me the features I
> created. But issue is when I try to put store-model.
>
> > On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
> wrote:
> >
> > Hi Amjad, After adding the libraries into the path, Did you restart the
> > SOLR ?
> >
> > On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
> >
> >> I followed the Solr LTR Documentation
> >>
> >> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
> >> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
> >>
> >> 1. Added library into the solr-config
> >> 
> >>   >> regex=".*\.jar" />
> >>  >> regex="solr-ltr-\d.*\.jar" />
> >> 2. Successfully added feature
> >> 3. Get schema to see feature is available
> >> 4. When I try to push model I see the error below, however I added the
> lib
> >> into solr-cofig
> >>
> >> Response
> >> {
> >>  "responseHeader":{
> >>"status":400,
> >>"QTime":1},
> >>  "error":{
> >>"metadata":[
> >>  "error-class","org.apache.solr.common.SolrException",
> >>  "root-error-class","java.lang.NullPointerException"],
> >>"msg":"org.apache.solr.ltr.model.ModelException: Model type does not
> >> exist org.apache.solr.ltr.model.LinearModel",
> >>"code":400}}
> >>
> >> Thanks
>
>


Re: Need help on LTR

2019-03-19 Thread Amjad Khan
Yes, I did.

I can see the feature that I created via 
schema/feature-store/exampleFeatureStore, and it returns the features I 
created. But the issue is when I try to put the store-model.

> On Mar 19, 2019, at 12:18 AM, Mohomed Rimash  wrote:
> 
> Hi Amjad, After adding the libraries into the path, Did you restart the
> SOLR ?
> 
> On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
> 
>> I followed the Solr LTR Documentation
>> 
>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
>> 
>> 1. Added library into the solr-config
>> 
>>  > regex=".*\.jar" />
>> > regex="solr-ltr-\d.*\.jar" />
>> 2. Successfully added feature
>> 3. Get schema to see feature is available
>> 4. When I try to push model I see the error below, however I added the lib
>> into solr-cofig
>> 
>> Response
>> {
>>  "responseHeader":{
>>"status":400,
>>"QTime":1},
>>  "error":{
>>"metadata":[
>>  "error-class","org.apache.solr.common.SolrException",
>>  "root-error-class","java.lang.NullPointerException"],
>>"msg":"org.apache.solr.ltr.model.ModelException: Model type does not
>> exist org.apache.solr.ltr.model.LinearModel",
>>"code":400}}
>> 
>> Thanks



Re: Need help on LTR

2019-03-19 Thread Amjad Khan
Hi,

Yes, I did restart the Solr server with this JVM param.

> On Mar 19, 2019, at 3:35 AM, Jörn Franke  wrote:
> 
> Did you add the option -Dsolr.ltr.enabled=true ?
> 
>> Am 19.03.2019 um 04:15 schrieb Amjad Khan :
>> 
>> I followed the Solr LTR Documentation 
>> 
>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html 
>> 
>> 
>> 1. Added library into the solr-config
>> 
>> > />
>> > />
>> 2. Successfully added feature
>> 3. Get schema to see feature is available
>> 4. When I try to push model I see the error below, however I added the lib 
>> into solr-cofig
>> 
>> Response
>> {
>> "responseHeader":{
>>   "status":400,
>>   "QTime":1},
>> "error":{
>>   "metadata":[
>> "error-class","org.apache.solr.common.SolrException",
>> "root-error-class","java.lang.NullPointerException"],
>>   "msg":"org.apache.solr.ltr.model.ModelException: Model type does not exist 
>> org.apache.solr.ltr.model.LinearModel",
>>   "code":400}}
>> 
>> Thanks



Re: Need help on LTR

2019-03-19 Thread Jörn Franke
Did you add the option -Dsolr.ltr.enabled=true ?
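
For example, one common way to pass it (assuming a standard install; the exact
file and path depend on your setup) is via solr.in.sh and then restarting Solr:

  SOLR_OPTS="$SOLR_OPTS -Dsolr.ltr.enabled=true"

or directly on the command line, e.g. bin/solr start -Dsolr.ltr.enabled=true.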

> Am 19.03.2019 um 04:15 schrieb Amjad Khan :
> 
> I followed the Solr LTR Documentation 
> 
> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html 
> 
> 
> 1. Added library into the solr-config
> 
>   />
> 
> 2. Successfully added feature
> 3. Get schema to see feature is available
> 4. When I try to push model I see the error below, however I added the lib 
> into solr-cofig
> 
> Response
> {
>  "responseHeader":{
>"status":400,
>"QTime":1},
>  "error":{
>"metadata":[
>  "error-class","org.apache.solr.common.SolrException",
>  "root-error-class","java.lang.NullPointerException"],
>"msg":"org.apache.solr.ltr.model.ModelException: Model type does not exist 
> org.apache.solr.ltr.model.LinearModel",
>"code":400}}
> 
> Thanks


Re: Need help on LTR

2019-03-18 Thread Mohomed Rimash
Hi Amjad, After adding the libraries into the path, Did you restart the
SOLR ?

On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:

> I followed the Solr LTR Documentation
>
> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
>
> 1. Added library into the solr-config
> 
>regex=".*\.jar" />
>  regex="solr-ltr-\d.*\.jar" />
> 2. Successfully added feature
> 3. Get schema to see feature is available
> 4. When I try to push model I see the error below, however I added the lib
> into solr-cofig
>
> Response
> {
>   "responseHeader":{
> "status":400,
> "QTime":1},
>   "error":{
> "metadata":[
>   "error-class","org.apache.solr.common.SolrException",
>   "root-error-class","java.lang.NullPointerException"],
> "msg":"org.apache.solr.ltr.model.ModelException: Model type does not
> exist org.apache.solr.ltr.model.LinearModel",
> "code":400}}
>
> Thanks


Need help on LTR

2019-03-18 Thread Amjad Khan
I followed the Solr LTR Documentation 

https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html 


1. Added library into the solr-config

  

2. Successfully added feature
3. Get schema to see feature is available
4. When I try to push the model I see the error below, even though I added the lib into 
solr-config

Response
{
  "responseHeader":{
"status":400,
"QTime":1},
  "error":{
"metadata":[
  "error-class","org.apache.solr.common.SolrException",
  "root-error-class","java.lang.NullPointerException"],
"msg":"org.apache.solr.ltr.model.ModelException: Model type does not exist 
org.apache.solr.ltr.model.LinearModel",
"code":400}}

Thanks
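
[The <lib> lines in step 1 were stripped by the list archive; only the regex
values survive above. For reference, the pattern the sample solrconfig uses
for the LTR contrib looks roughly like the following -- the dir values here
are the sample defaults and may differ in your install:

  <lib dir="${solr.install.dir:../../../..}/contrib/ltr/lib/" regex=".*\.jar" />
  <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-ltr-\d.*\.jar" />
]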

Re: Help with a DIH config file

2019-03-16 Thread Jörn Franke
You have to specify the option recursive=true on the entity files
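
Roughly like this, on the outer FileListEntityProcessor entity (everything
except recursive is a placeholder here):

  <entity name="files" processor="FileListEntityProcessor"
          baseDir="/path/to/documents" fileName=".*"
          recursive="true" rootEntity="false">
    ...
  </entity>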

On Fri, Mar 15, 2019 at 7:59 PM wclarke  wrote:

> One last question.
>
> I have everything running as it should finally.  However, when I pull out
> of
> testing to do the entire directory it's just cycling through.  The
> directory
> is full of folders that have the documents in them.  Do I need an html or
> other file sitting in there randomly to get it to start recursion through
> the folders?  I am attaching my dih config to see the single change I made
> to the base directory.  Am I just being impatient and it will eventually
> start going in the folders?
>
> Thanks! tika-data-config-2.xml
> 
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Help with a DIH config file

2019-03-16 Thread Jörn Franke
https://lucene.apache.org/solr/guide/6_6/uploading-structured-data-store-data-with-the-data-import-handler.html#the-filelistentityprocessor

On Sun, Mar 17, 2019 at 1:32 AM Jörn Franke  wrote:

> You have to specify the option recursive=true on the entity files
>
> On Fri, Mar 15, 2019 at 7:59 PM wclarke  wrote:
>
>> One last question.
>>
>> I have everything running as it should finally.  However, when I pull out
>> of
>> testing to do the entire directory it's just cycling through.  The
>> directory
>> is full of folders that have the documents in them.  Do I need an html or
>> other file sitting in there randomly to get it to start recursion through
>> the folders?  I am attaching my dih config to see the single change I made
>> to the base directory.  Am I just being impatient and it will eventually
>> start going in the folders?
>>
>> Thanks! tika-data-config-2.xml
>> 
>>
>>
>>
>>
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>
>


Re: Help with a DIH config file

2019-03-15 Thread wclarke
One last question.

I have everything running as it should finally.  However, when I pull out of
testing to do the entire directory it's just cycling through.  The directory
is full of folders that have the documents in them.  Do I need an html or
other file sitting in there randomly to get it to start recursion through
the folders?  I am attaching my dih config to see the single change I made
to the base directory.  Am I just being impatient and it will eventually
start going in the folders?

Thanks! tika-data-config-2.xml
  



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Help with a DIH config file

2019-03-15 Thread wclarke
Thanks! that fixed it.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Help with a DIH config file

2019-03-15 Thread Tim Allison
Haha, looks like Jörn just answered this... onError="skip|continue"

>greatly preferable if the indexing process could ignore exceptions
Please, no.  I'm 100% behind the sentiment that DIH should gracefully
handle Tika exceptions, but the better option is to log the
exceptions, store the stacktraces and report your high priority
problems to Apache Tika and/or its dependencies so that we can fix
them.  Try running tika-eval[0] against a subset of your docs,
perhaps.

That said, DIH's integration with Tika is not intended for robust
production use.  It is intended to get people up to speed quickly and,
effectively, for demo purposes.  I recognize that it is being used in
production around the world, but it really shouldn't be.

See Erick Erickson's[1]:
>But, i wouldn’t really recommend that you just ship the docs to Solr, I’d 
>recommend that you build a little program to do the extraction on one or more 
>clients, the details of why are here:

>https://lucidworks.com/2012/02/14/indexing-with-solrj/

[0] https://wiki.apache.org/tika/TikaEval
[1] 
https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201903.mbox/ajax/%3CF2034803-D4A8-48E1-889A-DA9E44961EE6%40gmail.com%3E

On Fri, Mar 15, 2019 at 7:44 AM Demian Katz  wrote:
>
> Jörn (and anyone else with more experience with this than I have),
>
> I've been working on Whitney with this issue. It is a PDF file, and it can be 
> opened successfully in a PDF reader. Interestingly, if I try to extract data 
> from it on the command line, Tika version 1.3 throws a lot of warnings but 
> does successfully extract data, but several newer versions, including 1.17 
> and 1.20 (haven't tested other intermediate versions) encounter a fatal error 
> and extract nothing. So this seems like something that used to work but has 
> stopped. Unfortunately, we haven't been able to find a way to downgrade to an 
> old enough Tika in her Solr installation to work around the problem that way.
>
> The bigger question, though, is whether there's a way to allow the DIH to 
> simply ignore errors and keep going. Whitney needs to index several terabytes 
> of arbitrary documents for her project, and at this scale, she can't afford 
> the time to stop and manually intervene for every strange document that 
> happens to be in the collection. It would be greatly preferable if the 
> indexing process could ignore exceptions and proceed on than if it just stops 
> dead at the first problem. (I'm also pretty sure that Whitney is already 
> using the ignoreTikaException attribute in her configuration, but it doesn't 
> seem to help in this instance).
>
> Any suggestions would be greatly appreciated!
>
> thanks,
> Demian
>
> -Original Message-
> From: Jörn Franke 
> Sent: Friday, March 15, 2019 4:18 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Help with a DIH config file
>
> Do you have an exception?
> It could be that the pdf is broken - can you open it on your computer with a 
> pdfreader?
>
> If the exception is related to Tika and pdf then file an issue with the 
> pdfbox project. If there is an issue with Tika and MsOffice documents then 
> Apache poi is the right project to ask.
>
> > Am 15.03.2019 um 03:41 schrieb wclarke :
> >
> > Thank you so much.  You helped a great deal.  I am running into one
> > last issue where the Tika DIH is stopping at a specific language and
> > fails there (Malayalam).  Do you know of a work around?
> >
> >
> >
> > --
> > Sent from:
> > https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Flucen
> > e.472066.n3.nabble.com%2FSolr-User-f472068.htmldata=02%7C01%7Cdem
> > ian.katz%40villanova.edu%7Ca54d5daee7b14648442908d6a91f9bf6%7C765a8de5
> > cf9444f09cafae5bf8cfa366%7C0%7C0%7C636882350564627071sdata=NpddZY
> > 2sHKJHAR8V%2BIlMt4j1i3oy94KP9%2Btp1EQ2xM4%3Dreserved=0


Re: Help with a DIH config file

2019-03-15 Thread Jörn Franke
In the Tika entity processor use the option onError=“skip”

Alternatives are abort (default) or continue (behave as nothing would have 
happened)

Skip skips the current document 
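A small sketch of where that option sits in the DIH config (entity and field names are placeholders, not the poster's actual setup):

<entity name="doc" processor="TikaEntityProcessor"
        url="${files.fileAbsolutePath}" format="text"
        onError="skip" dataSource="bin">
  <!-- onError="skip" drops the failing document; "continue" ignores the error,
       "abort" (the default) stops the whole import -->
  <field column="text" name="fulltext" />
</entity>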

> Am 15.03.2019 um 12:44 schrieb Demian Katz :
> 
> Jörn (and anyone else with more experience with this than I have),
> 
> I've been working on Whitney with this issue. It is a PDF file, and it can be 
> opened successfully in a PDF reader. Interestingly, if I try to extract data 
> from it on the command line, Tika version 1.3 throws a lot of warnings but 
> does successfully extract data, but several newer versions, including 1.17 
> and 1.20 (haven't tested other intermediate versions) encounter a fatal error 
> and extract nothing. So this seems like something that used to work but has 
> stopped. Unfortunately, we haven't been able to find a way to downgrade to an 
> old enough Tika in her Solr installation to work around the problem that way.
> 
> The bigger question, though, is whether there's a way to allow the DIH to 
> simply ignore errors and keep going. Whitney needs to index several terabytes 
> of arbitrary documents for her project, and at this scale, she can't afford 
> the time to stop and manually intervene for every strange document that 
> happens to be in the collection. It would be greatly preferable if the 
> indexing process could ignore exceptions and proceed on than if it just stops 
> dead at the first problem. (I'm also pretty sure that Whitney is already 
> using the ignoreTikaException attribute in her configuration, but it doesn't 
> seem to help in this instance).
> 
> Any suggestions would be greatly appreciated!
> 
> thanks,
> Demian
> 
> -Original Message-
> From: Jörn Franke  
> Sent: Friday, March 15, 2019 4:18 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Help with a DIH config file
> 
> Do you have an exception?
> It could be that the pdf is broken - can you open it on your computer with a 
> pdfreader?
> 
> If the exception is related to Tika and pdf then file an issue with the 
> pdfbox project. If there is an issue with Tika and MsOffice documents then 
> Apache poi is the right project to ask.
> 
>> Am 15.03.2019 um 03:41 schrieb wclarke :
>> 
>> Thank you so much.  You helped a great deal.  I am running into one 
>> last issue where the Tika DIH is stopping at a specific language and 
>> fails there (Malayalam).  Do you know of a work around?
>> 
>> 
>> 
>> --
>> Sent from: 
>> https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Flucen
>> e.472066.n3.nabble.com%2FSolr-User-f472068.htmldata=02%7C01%7Cdem
>> ian.katz%40villanova.edu%7Ca54d5daee7b14648442908d6a91f9bf6%7C765a8de5
>> cf9444f09cafae5bf8cfa366%7C0%7C0%7C636882350564627071sdata=NpddZY
>> 2sHKJHAR8V%2BIlMt4j1i3oy94KP9%2Btp1EQ2xM4%3Dreserved=0


RE: Help with a DIH config file

2019-03-15 Thread Demian Katz
Jörn (and anyone else with more experience with this than I have),

I've been working with Whitney on this issue. It is a PDF file, and it can be 
opened successfully in a PDF reader. Interestingly, if I try to extract data 
from it on the command line, Tika version 1.3 throws a lot of warnings but does 
successfully extract data, but several newer versions, including 1.17 and 1.20 
(haven't tested other intermediate versions) encounter a fatal error and 
extract nothing. So this seems like something that used to work but has 
stopped. Unfortunately, we haven't been able to find a way to downgrade to an 
old enough Tika in her Solr installation to work around the problem that way.

The bigger question, though, is whether there's a way to allow the DIH to 
simply ignore errors and keep going. Whitney needs to index several terabytes 
of arbitrary documents for her project, and at this scale, she can't afford the 
time to stop and manually intervene for every strange document that happens to 
be in the collection. It would be greatly preferable if the indexing process 
could ignore exceptions and proceed on than if it just stops dead at the first 
problem. (I'm also pretty sure that Whitney is already using the 
ignoreTikaException attribute in her configuration, but it doesn't seem to help 
in this instance).

Any suggestions would be greatly appreciated!

thanks,
Demian

-Original Message-
From: Jörn Franke  
Sent: Friday, March 15, 2019 4:18 AM
To: solr-user@lucene.apache.org
Subject: Re: Help with a DIH config file

Do you have an exception?
It could be that the pdf is broken - can you open it on your computer with a 
pdfreader?

If the exception is related to Tika and pdf then file an issue with the pdfbox 
project. If there is an issue with Tika and MsOffice documents then Apache poi 
is the right project to ask.

> Am 15.03.2019 um 03:41 schrieb wclarke :
> 
> Thank you so much.  You helped a great deal.  I am running into one 
> last issue where the Tika DIH is stopping at a specific language and 
> fails there (Malayalam).  Do you know of a work around?
> 
> 
> 
> --
> Sent from: 
> https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Flucen
> e.472066.n3.nabble.com%2FSolr-User-f472068.htmldata=02%7C01%7Cdem
> ian.katz%40villanova.edu%7Ca54d5daee7b14648442908d6a91f9bf6%7C765a8de5
> cf9444f09cafae5bf8cfa366%7C0%7C0%7C636882350564627071sdata=NpddZY
> 2sHKJHAR8V%2BIlMt4j1i3oy94KP9%2Btp1EQ2xM4%3Dreserved=0


Re: Help with a DIH config file

2019-03-15 Thread Jörn Franke
Do you have an exception?
It could be that the pdf is broken - can you open it on your computer with a 
pdfreader?

If the exception is related to Tika and pdf then file an issue with the pdfbox 
project. If there is an issue with Tika and MsOffice documents then Apache poi 
is the right project to ask.

> Am 15.03.2019 um 03:41 schrieb wclarke :
> 
> Thank you so much.  You helped a great deal.  I am running into one last
> issue where the Tika DIH is stopping at a specific language and fails there
> (Malayalam).  Do you know of a work around?
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Help with a DIH config file

2019-03-15 Thread wclarke
Thank you so much.  You helped a great deal.  I am running into one last
issue where the Tika DIH is stopping at a specific language and fails there
(Malayalam).  Do you know of a work around?



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Help with a DIH config file

2019-03-14 Thread Jörn Franke
Sorry for my late reply. Thanks for sharing.

Yes, this is possible.

Maybe my last mails were confusing. I hope the examples below help.

Alternative 1 - Use only DIH without update processor
tika-data-config-2.xml - add the transformer in the entity and the transformation in the
field (here done for id and for fulltext) - additionally set the
TikaEntityProcessor format to "text":
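The XML for this example was lost in archiving; a rough sketch of the idea only, with guessed entity names, paths and field mappings rather than the original attachment:

<dataConfig>
  <dataSource type="BinFileDataSource" name="bin" />
  <document>
    <entity name="files" processor="FileListEntityProcessor"
            baseDir="D:/foo/resource" fileName=".*" recursive="true"
            rootEntity="false" dataSource="null">
      <!-- format="text" keeps Tika from emitting XHTML markup;
           RegexTransformer applies the regex/replaceWith attributes below.
           The id column is assumed to carry the original file path. -->
      <entity name="doc" processor="TikaEntityProcessor"
              url="${files.fileAbsolutePath}" format="text"
              transformer="RegexTransformer" dataSource="bin">
        <field column="id" regex="D:\\foo\\resource\\(.*)" replaceWith="http://$1" />
        <field column="text" name="fulltext" regex="\n|\t|\r" replaceWith=" " />
      </entity>
    </entity>
  </document>
</dataConfig>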

Alternative 2 - Regex processor in solrconfig.xml - you need to put
everything into ONE chain

<updateRequestProcessorChain name="my-chain">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">_text_</str>
    <str name="dest">fulltext</str>
  </processor>
  <processor class="solr.RegexReplaceProcessorFactory">
    <!-- strips newlines and carriage returns from both fields -->
    <str name="fieldName">_text_</str>
    <str name="fieldName">fulltext</str>
    <str name="pattern">\n|\r</str>
    <str name="replacement"> </str>
    <bool name="literalReplacement">true</bool>
  </processor>
  <processor class="solr.RegexReplaceProcessorFactory">
    <!-- replaces non-word characters in id/url with "/" -->
    <str name="fieldName">id</str>
    <str name="fieldName">url</str>
    <str name="pattern">[^\w|\.]</str>
    <str name="replacement">/</str>
    <bool name="literalReplacement">true</bool>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

[..]

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">tika-data-config-2.xml</str>
    <str name="update.chain">my-chain</str>
  </lst>
</requestHandler>



On Thu, Mar 14, 2019 at 6:41 AM wclarke  wrote:

> Got each one working individually, but not multiples.  Is it possible?
> Please see attached files.
>
> Thanks!!! tika-data-config-2.xml
> <http://lucene.472066.n3.nabble.com/file/t494707/tika-data-config-2.xml>
> solrconfig.xml
> <http://lucene.472066.n3.nabble.com/file/t494707/solrconfig.xml>
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Help with a DIH config file

2019-03-13 Thread wclarke
Got each one working individually, but not multiples.  Is it possible? 
Please see attached files.

Thanks!!! tika-data-config-2.xml
<http://lucene.472066.n3.nabble.com/file/t494707/tika-data-config-2.xml>
solrconfig.xml
<http://lucene.472066.n3.nabble.com/file/t494707/solrconfig.xml>



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Help with a DIH config file

2019-03-13 Thread wclarke
I didn't know I could do an updateProcessorChain and call it in the config
file.  I tried doing it in the solrconfig, but it just wouldn't take.  I
will try this though!  Thanks

The value is the file path in id/url.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Help with a DIH config file

2019-03-13 Thread wclarke
Absolutely!  I attached it to the original message, but I can post here too. 
I am VERY new to Solr and am winging it and while the documentation has been
a little helpful, I just need more complex examples.

tika-data-config-2.xml
  



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Help with a DIH config file

2019-03-12 Thread Jörn Franke
Some addition: You can also strip HTML in DIH using the HTML Strip
transformer:
https://wiki.apache.org/solr/DataImportHandler#HTMLStripTransformer

In that way you can probably live without an UpdateRequestProcessorChain
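A small sketch of the HTMLStripTransformer in a DIH entity (entity, field and data source names are placeholders):

<entity name="doc" processor="TikaEntityProcessor"
        url="${files.fileAbsolutePath}"
        transformer="HTMLStripTransformer" dataSource="bin">
  <!-- stripHTML="true" removes markup from this column before indexing -->
  <field column="text" name="fulltext" stripHTML="true" />
</entity>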

On Tue, Mar 12, 2019 at 10:24 PM Jörn Franke  wrote:

> Would it be possible to share the DIH config file?
>
> I am not sure if I get all your points correctly.
>
> Ad 1) is this about a value in a field? Then use the regex transformer:
> https://wiki.apache.org/solr/DataImportHandler#RegexTransformer
> Alternatively, use a RegexReplaceProcessorFactoryin solrconfig.xml or a
> ScriptTransformer in DIH. E.g. a RegexReplaceProcessorFactory (
> https://lucene.apache.org/solr/7_3_0//solr-core/org/apache/solr/update/processor/RegexReplaceProcessorFactory.html)
> in a custom processing chain in solrconfig.xml
> 
>content
>\n|\t|\r
>
>true
>  
>   
>   
> 
>
> and attach it to your dih in solrconfig.xml
> 
> 
>   data-config.xml
> regex_replace
> 
> 
>
>
>
>
> ad 2) was this html part of the original document or is it "HTML"
> generated by Tika. In the first case then you can use a
> HTMLStripFieldUpdateProcessorFactory that should be configured in the
> solrconfig.xml:
> https://lucene.apache.org/solr/6_6_0//solr-core/org/apache/solr/update/processor/HTMLStripFieldUpdateProcessorFactory.html
> You need to create an update processor chain
> https://lucene.apache.org/solr/guide/7_3/update-request-processors.html#custom-update-request-processor-chain
>
>
> 
>   
> myfyfield
>   
>   
>   
> 
>
> and attach it to your dih in solrconfig.xml
> 
> 
>   data-config.xml
> remove_html
> 
> 
>
> In the second case (Tika attaches XML elements) specify
> extractFormat="text" for Tika in DIH :
> https://lucene.apache.org/solr/guide/6_6/uploading-data-with-solr-cell-using-apache-tika.html
>
> add 3) see 1)
>
> Note: You can only create one chain / DIH, so you need to put all the
> processors that you want to apply into one chain. The transformers are
> independent of the processors and are configured in the DIH.
>
>
>
> On Tue, Mar 12, 2019 at 7:47 PM wclarke  wrote:
>
>> I have a previous post that looks like this:
>>
>> I am pulling a large amount of data from a local source
>> D:\foo\resource\.  I
>> am using tika through a DIH to index the multiple file formats with text
>> and
>> metadata.  I have almost all the information being pulled that I want,
>> however, I am having a couple of issues:
>>
>> 1. I need to run a regex replace of the D:\foo\resource\ to be http://,
>> which is part of what I want to use XPath for.  I have the regex written,
>> but not the replacement and I am not sure of where it needs to be located
>> in
>> my data-config.xml file.
>>
>> 2. I want to strip html where necessary also using XPath.
>>
>> 3. I need to remove \n, \t, \r, and any other extra crap I am getting in
>> the
>> text field to just get to the text content of the document, whatever mime
>> type that might be so that it can be searchable.
>>
>> I am running it through the solr admin data import as opposed to the
>> post.jar (I have tried both).  And this is running on Windows and cannot
>> be
>> run on Linux as we have no one who can support it.  I am posting my
>> tika-data-config.xml (not tikaconfig) I named it this way so as not to be
>> confused with our db-config for our catalog pull.
>>
>> Thanks in advance for any help.  And I will upload any additional files
>> that
>> might be helpful upon request - I don't want to overload the post.
>>
>> We are a small non-profit without a great deal of money, however, if there
>> is someone who could finish writing it we would be willing to pay a little
>> something for time.  We really need this done ASAP!
>>
>>
>>
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>
>


Re: Help with a DIH config file

2019-03-12 Thread Jörn Franke
Would it be possible to share the DIH config file?

I am not sure if I get all your points correctly.

Ad 1) is this about a value in a field? Then use the regex transformer:
https://wiki.apache.org/solr/DataImportHandler#RegexTransformer
Alternatively, use a RegexReplaceProcessorFactory in solrconfig.xml or a
ScriptTransformer in DIH. E.g. a RegexReplaceProcessorFactory (
https://lucene.apache.org/solr/7_3_0//solr-core/org/apache/solr/update/processor/RegexReplaceProcessorFactory.html)
in a custom processing chain in solrconfig.xml

<updateRequestProcessorChain name="regex_replace">
  <processor class="solr.RegexReplaceProcessorFactory">
    <!-- strips newlines, tabs and carriage returns from the content field -->
    <str name="fieldName">content</str>
    <str name="pattern">\n|\t|\r</str>
    <str name="replacement"> </str>
    <bool name="literalReplacement">true</bool>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

and attach it to your dih in solrconfig.xml

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
    <str name="update.chain">regex_replace</str>
  </lst>
</requestHandler>




Ad 2) Was this HTML part of the original document, or is it "HTML" generated
by Tika? In the first case you can use a
HTMLStripFieldUpdateProcessorFactory that should be configured in the
solrconfig.xml:
https://lucene.apache.org/solr/6_6_0//solr-core/org/apache/solr/update/processor/HTMLStripFieldUpdateProcessorFactory.html
You need to create an update processor chain
https://lucene.apache.org/solr/guide/7_3/update-request-processors.html#custom-update-request-processor-chain



  
<updateRequestProcessorChain name="remove_html">
  <processor class="solr.HTMLStripFieldUpdateProcessorFactory">
    <!-- strips HTML markup from this field before indexing -->
    <str name="fieldName">myfyfield</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

and attach it to your dih in solrconfig.xml

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
    <str name="update.chain">remove_html</str>
  </lst>
</requestHandler>


In the second case (Tika attaches XML elements) specify
extractFormat="text" for Tika in DIH :
https://lucene.apache.org/solr/guide/6_6/uploading-data-with-solr-cell-using-apache-tika.html
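(In the DIH entity itself this is typically expressed with the format attribute; a rough sketch with placeholder names:)

<entity name="doc" processor="TikaEntityProcessor"
        url="${files.fileAbsolutePath}" format="text" dataSource="bin">
  <field column="text" name="fulltext" />
</entity>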

Ad 3) see 1)

Note: You can only create one chain / DIH, so you need to put all the
processors that you want to apply into one chain. The transformers are
independent of the processors and are configured in the DIH.



On Tue, Mar 12, 2019 at 7:47 PM wclarke  wrote:

> I have a previous post that looks like this:
>
> I am pulling a large amount of data from a local source D:\foo\resource\.
> I
> am using tika through a DIH to index the multiple file formats with text
> and
> metadata.  I have almost all the information being pulled that I want,
> however, I am having a couple of issues:
>
> 1. I need to run a regex replace of the D:\foo\resource\ to be http://,
> which is part of what I want to use XPath for.  I have the regex written,
> but not the replacement and I am not sure of where it needs to be located
> in
> my data-config.xml file.
>
> 2. I want to strip html where necessary also using XPath.
>
> 3. I need to remove \n, \t, \r, and any other extra crap I am getting in
> the
> text field to just get to the text content of the document, whatever mime
> type that might be so that it can be searchable.
>
> I am running it through the solr admin data import as opposed to the
> post.jar (I have tried both).  And this is running on Windows and cannot be
> run on Linux as we have no one who can support it.  I am posting my
> tika-data-config.xml (not tikaconfig) I named it this way so as not to be
> confused with our db-config for our catalog pull.
>
> Thanks in advance for any help.  And I will upload any additional files
> that
> might be helpful upon request - I don't want to overload the post.
>
> We are a small non-profit without a great deal of money, however, if there
> is someone who could finish writing it we would be willing to pay a little
> something for time.  We really need this done ASAP!
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>

