Re: analyzer for _text_ field

2016-07-15 Thread Waldyr Neto
Thanks a lot, I'll try it soon and give you feedback :)

On Fri, Jul 15, 2016 at 4:07 PM, David Santamauro <
david.santama...@gmail.com> wrote:

>
> The opening and closing single quotes don't match
>
> -data-binary '{ ... }’
>
> it should be:
>
> -data-binary '{ ... }'
>
>
>
> On 07/15/2016 02:59 PM, Steve Rowe wrote:
>
>> Waldyr, maybe it got mangled by my email client or yours?
>>
>> Here’s the same command:
>>
>>
>>
>> --
>> Steve
>> www.lucidworks.com
>>
>> On Jul 15, 2016, at 2:16 PM, Waldyr Neto  wrote:
>>>
>>> Hi Steve, thanks for the help.
>>> Unfortunately I'm making a mistake somewhere.
>>>
>>> when i try to run
>>>

> curl -X POST -H 'Content-type: application/json’ \
>>> http://localhost:8983/solr/gettingstarted/schema --data-binary
>>> '{"add-field-type": { "name": "my_new_field_type", "class":
>>> "solr.TextField","analyzer": {"charFilters": [{"class":
>>> "solr.HTMLStripCharFilterFactory"}], "tokenizer": {"class":
>>> "solr.StandardTokenizerFactory"},"filters":[{"class":
>>> "solr.WordDelimiterFilterFactory"}, {"class":
>>> "solr.LowerCaseFilterFactory"}]}},"replace-field": { "name":
>>> "_text_","type": "my_new_field_type", "multiValued": "true","indexed":
>>> "true","stored": "false"}}’
>>>
>>> I receive the following error messages from the curl program:
>>>
>>> curl: (3) [globbing] unmatched brace in column 1
>>>
>>> curl: (6) Could not resolve host: name
>>>
>>> curl: (6) Could not resolve host: my_new_field_type,
>>>
>>> curl: (6) Could not resolve host: class
>>>
>>> curl: (6) Could not resolve host: solr.TextField,analyzer
>>>
>>> curl: (3) [globbing] unmatched brace in column 1
>>>
>>> curl: (3) [globbing] bad range specification in column 2
>>>
>>> curl: (3) [globbing] unmatched close brace/bracket in column 32
>>>
>>> curl: (6) Could not resolve host: tokenizer
>>>
>>> curl: (3) [globbing] unmatched brace in column 1
>>>
>>> curl: (3) [globbing] unmatched close brace/bracket in column 30
>>>
>>> curl: (3) [globbing] unmatched close brace/bracket in column 32
>>>
>>> curl: (3) [globbing] unmatched brace in column 1
>>>
>>> curl: (3) [globbing] unmatched close brace/bracket in column 28
>>>
>>> curl: (3) [globbing] unmatched brace in column 1
>>>
>>> curl: (6) Could not resolve host: name
>>>
>>> curl: (6) Could not resolve host: _text_,type
>>>
>>> curl: (6) Could not resolve host: my_new_field_type,
>>>
>>> curl: (6) Could not resolve host: multiValued
>>>
>>> curl: (6) Could not resolve host: true,indexed
>>>
>>> curl: (6) Could not resolve host: true,stored
>>>
>>> curl: (3) [globbing] unmatched close brace/bracket in column 6
>>>
>>> cvs1:~ vvisionphp1$
>>>
>>> On Fri, Jul 15, 2016 at 2:45 PM, Steve Rowe  wrote:
>>>
>>> Hi Waldyr,

 An example of changing the _text_ analyzer by first creating a new field
 type, and then changing the _text_ field to use the new field type
 (after
 starting Solr 6.1 with “bin/solr start -e schemaless”):

 -
 PROMPT$ curl -X POST -H 'Content-type: application/json’ \
 http://localhost:8983/solr/gettingstarted/schema --data-binary '{
   "add-field-type": {
 "name": "my_new_field_type",
 "class": "solr.TextField",
 "analyzer": {
   "charFilters": [{
 "class": "solr.HTMLStripCharFilterFactory"
   }],
   "tokenizer": {
 "class": "solr.StandardTokenizerFactory"
   },
   "filters":[{
   "class": "solr.WordDelimiterFilterFactory"
 }, {
   "class": "solr.LowerCaseFilterFactory"
   }]}},
   "replace-field": {
 "name": "_text_",
 "type": "my_new_field_type",
 "multiValued": "true",
 "indexed": "true",
 "stored": "false"
   }}’
 -

 PROMPT$ curl
 http://localhost:8983/solr/gettingstarted/schema/fields/_text_

 -
 {
   "responseHeader”:{ […] },
   "field":{
 "name":"_text_",
 "type":"my_new_field_type",
 "multiValued":true,
 "indexed":true,
 "stored":false}}
 -

 --
 Steve
 www.lucidworks.com

 On Jul 15, 2016, at 12:54 PM, Waldyr Neto  wrote:
>
> Hi, how can I configure the analyzer for the _text_ field?
>



>>


Re: Multilevel grouping?

2016-07-15 Thread William Bell
On grouping, if I only want the first 5 responses, it would be great if the
code short-circuited to improve performance.

I am not sure I want it grouping 10M results when I already have 5 that
are good enough.

On Thu, Jul 14, 2016 at 10:33 AM, Callum Lamb  wrote:

> Look at the collapse module
>
> https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results
> .
> It can do the same thing as group.
>
> If you want to get counts/facets from before the collapse, tag the collapse
> statement and use excludeTags in your JSON facets (there's an
> equivalent for non-JSON facets). I think the default nullPolicy is
> different from grouping too, but you can change it to be the same.
>
> I've not been able to get two collapses to work on my version of Solr. But
> collapse + group works, and you can get two levels. Not being able to do
> multiple collapses appears to be a bug (it sort of works). I recall there
> being a JIRA case somewhere stating it was fixed in some version, so you may
> be able to do as many levels as you like if you upgrade or already run a very
> recent version of Solr.
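
The collapse-plus-tagged-exclusion approach described above might look roughly like the following sketch. The field names `groupField` and `category` are hypothetical; `excludeTags` in the JSON Facet domain keeps the facet counts computed from the pre-collapse result set:

```
fq={!collapse field=groupField tag=collapsing}
json.facet={
  categories: {
    type: terms,
    field: category,
    domain: { excludeTags: collapsing }
  }
}
```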
>
>
>
>
>
> On Thu, Jul 14, 2016 at 3:52 PM, Aditya Sundaram <
> aditya.sunda...@myntra.com
> > wrote:
>
> > Thanks Yonik, I was looking for exactly that. Is there any workaround to
> > achieve it currently?
> >
> > On Tue, Jul 12, 2016 at 5:07 PM, Yonik Seeley  wrote:
> >
> > > I started this a while ago, but haven't found the time to finish:
> > > https://issues.apache.org/jira/browse/SOLR-7830
> > >
> > > -Yonik
> > >
> > >
> > > On Tue, Jul 12, 2016 at 7:29 AM, Aditya Sundaram
> > >  wrote:
> > > > Does Solr support multilevel grouping? I want to group up to 2 or 3
> > > > levels based on different fields, i.e. first group on field one, within
> > > > which I group by field two, etc.
> > > > I am aware of facet.pivot, which does the same but retrieves only the
> > > > count. Is there any way to get the documents as well, along with the
> > > > count, in facet.pivot?
> > > >
> > > > --
> > > > Aditya Sundaram
> > >
> >
> >
> >
> > --
> > Aditya Sundaram
> > Software Engineer, Technology team
> > AKR Tech park B Block, B1 047
> > +91-9844006866
> >
>
>


-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


Adding solr to existing web application

2016-07-15 Thread Natarajan, Rajeswari
Hi,

We have a Spring Boot application and we would like to add Solr to it, running
in the same process as the Spring Boot application. We tried to add
SolrDispatchFilter in the Spring Boot application. The filter is added
successfully, but then the Solr admin panel is not reachable, and there are no
errors. We use Solr 6.1.

Has anyone done this successfully? If yes, can you please share the steps.

Thank you,
Rajeswari


SOLR-8297 - Join on sharded collections

2016-07-15 Thread Shikha Somani
Hi,


The Solr join on sharded collections was supported in Solr 4.x but broke in
Solr 5.x (SOLR-8297). This is a blocking issue as it directly impacts search
functionality.


A fix for this issue is committed and ready for merge. Please review its PR
and let me know your thoughts on it.

Appreciate your quick response on this (these changes are nearly a month old).

PR: https://github.com/apache/lucene-solr/pull/35


Thanks,
Shikha











Re: index sql databases

2016-07-15 Thread kostali hassan
Thank you Shawn. The problem is that when I display the date field I get the
value in this format: yyyy-MM-dd'T'hh:mm:ss'Z'


Re: index sql databases

2016-07-15 Thread Shawn Heisey
On 7/15/2016 12:42 PM, kostali hassan wrote:
> I use Solr 5.4.1. When an attribute of type date is all zeros
> (0000-00-00 00:00:00), the indexing process stops and the log has an error.
> How do I change driver="com.mysql.jdbc.Driver" to ignore the null date? Last
> question: how do I set dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'" on the
> field for date and time (hh:mm:ss)?

An all-zero date is NOT a null date.  This is a valid date for MySQL,
but it is not a valid date for Solr, and Solr will complain about it.

You need to add a parameter to your JDBC URL to set zero dates to null,
so the field will not be present in the Solr document.  I am using this
parameter in my URL definition:

   
url="jdbc:mysql://${dih.request.dbHost}:3306/${dih.request.dbSchema}?zeroDateTimeBehavior=convertToNull"

I have no idea what you are trying to do with that tag.  If the
MySQL field is a date/time field, JDBC will typically put it into the
correct format for Solr date field types.  About the only thing that
might go wrong is timezones -- you may need to explicitly set the
timezone with MySQL URL parameters.
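
If changing the JDBC URL is not an option, the same normalization can be done in application code before the document is sent to Solr. This is a sketch of the conversion rule Shawn describes, not DIH itself:

```python
from datetime import datetime

def to_solr_date(raw):
    """Convert a MySQL DATETIME string to Solr's date format,
    mapping MySQL's all-zero date to None (i.e. omit the field)."""
    if raw is None or raw.startswith("0000-00-00"):
        # Same effect as zeroDateTimeBehavior=convertToNull on the JDBC URL
        return None
    dt = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
    # Solr expects UTC ISO-8601 with a trailing 'Z'
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")

print(to_solr_date("2016-07-15 12:42:00"))  # 2016-07-15T12:42:00Z
print(to_solr_date("0000-00-00 00:00:00"))  # None
```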

Thanks,
Shawn



Re: Reusing `geodist()` value in a function query

2016-07-15 Thread Aakash Sabharwal
Hey!

Any help on this would be appreciated.
I tried tracing through the code and I couldn't find evidence of any level
of geodist() caching.

Aakash

On Thu, Jul 14, 2016 at 7:33 PM Aakash Sabharwal 
wrote:

> Alternatively if I do the following:
>
> *q={!boost b=max(sub(abs(sub($geodz,0.0)),1.6),
> pow(abs(sub($geodz,0.0)),1.2))}&geodz=geodist()*
>
> Does this lead to geodist() being recomputed or not?
>
>
>
>
>
> On Thu, Jul 14, 2016 at 6:37 PM Aakash Sabharwal <
> aakashsabhar...@gmail.com> wrote:
>
>> Hello,
>>
>> geodist calculations can be expensive, so I am trying to find ways to
>> optimize around them.
>> In particular I am trying to figure out whether sub-function queries are
>> cached while calculating an overall boost function for queries.
>>
>> For example if I query with the below:
>>
>>
>> *q={!boost b=max(sub(abs(sub(geodist(),0.0)),1.6),
>> pow(abs(sub(geodist(),0.0)),1.2))}*
>>
>> Will geodist() be invoked twice or will it be re-used?
>> I believe function queries do get converted to FunctionValues which is a
>> ValueSource. Are those always cached as DocValues or is that only in case
>> of field values?
>>
>> Aakash
>>
>


Re: using lucene parser syntax with eDisMax

2016-07-15 Thread Erick Erickson
Yes on both counts. Although it takes a bit of practice, if you add
debug=query to the request you'll see a section of the
response showing you exactly what the resulting query is after
all the rules are applied.
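
A sketch of a request with debug=query added, following the fields and handler used in this thread (the host and core name are the usual local defaults, not confirmed here):

```python
from urllib.parse import urlencode

# Build the query string for an edismax request; debug=query asks Solr to
# include a "debug" section showing the fully parsed query.
params = {
    "q": "color:brown^2 and mazda",  # field-qualified and bare terms mixed
    "defType": "edismax",
    "qf": "color brand",
    "debug": "query",
}
url = "http://localhost:8983/solr/gettingstarted/select?" + urlencode(params)
print(url)
```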

Best,
Erick

On Fri, Jul 15, 2016 at 12:32 PM, Whelan, Andy  wrote:
> Hello,
>
> I am using the eDisMax parser and have the following question.
> With the eDisMax parser we can pass a query, q="brown and mazda",  and 
> configure a bunch of fields in a solrconfig.xml SearchHandler to query on as 
> "qf". Let's say I have a SOLR schema.xml with the following fields:
> 
> 
>
> and the following request handler in solrconfig.xml:
>
> <requestHandler name="/select" class="solr.SearchHandler">
>   <lst name="defaults">
>     <str name="defType">edismax</str>
>     <str name="qf">color brand</str>
>   </lst>
> </requestHandler>
>
> This makes boosting very easy.  I can execute a query (q=brown^2.0 and
> mazda^3.0) against the query handler "/select" above without specifying
> fields in the query string.  I can do this without having to copy color and
> brand to a specific catch-all field as I would with the "lucene" parser
> (which would be configured as the default field "df").
> The documentation at 
> https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser
>  says that eDisMax "supports the full Lucene query parser syntax".
> Does this mean that a query string "color:brown^2 and mazda" is legal with 
> eDisMax too?  Notice that I am specifying the color field in the query 
> (lucene parser syntax). If the answer is yes, does this mean that "brown" is 
> only filtered against the color field and mazda will be filtered against both 
> the color field and the brand field?
> Thanks!
>


Indexing documents stored in HDFS

2016-07-15 Thread Rishabh Patel
Hello,

I am trying to find a way to index some documents, all located in a
directory in HDFS.

Since HDFS has a REST API, I was trying to use the DataImportHandler(DIH)
along with the datasource type as URLDataSource, to index the documents.

Is this approach wrong? If so, then is there a canonical way to index
documents present in HDFS?
-- 
Sincerely,
*Rishabh Patel*
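
For what it's worth, a DIH data-config along the lines described in this question might look like the sketch below. The namenode host, port, and file path are hypothetical; op=OPEN is WebHDFS's standard read operation, and this assumes the stored documents are XML that XPathEntityProcessor can walk:

```xml
<dataConfig>
  <dataSource type="URLDataSource" encoding="UTF-8"/>
  <document>
    <entity name="hdfsDoc"
            processor="XPathEntityProcessor"
            url="http://namenode.example.com:50070/webhdfs/v1/data/docs/doc1.xml?op=OPEN"
            forEach="/doc">
      <field column="id" xpath="/doc/id"/>
      <field column="title" xpath="/doc/title"/>
    </entity>
  </document>
</dataConfig>
```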


using lucene parser syntax with eDisMax

2016-07-15 Thread Whelan, Andy
Hello,

I am using the eDisMax parser and have the following question.
With the eDisMax parser we can pass a query, q="brown and mazda",  and 
configure a bunch of fields in a solrconfig.xml SearchHandler to query on as 
"qf". Let's say I have a SOLR schema.xml with the following fields:



and the following request handler in solrconfig.xml:

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">color brand</str>
  </lst>
</requestHandler>


This makes boosting very easy.  I can execute a query (q=brown^2.0 and
mazda^3.0) against the query handler "/select" above without specifying fields
in the query string.  I can do this without having to copy color and brand to a
specific catch-all field as I would with the "lucene" parser (which would be
configured as the default field "df").
The documentation at 
https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser
 says that eDisMax "supports the full Lucene query parser syntax".
Does this mean that a query string "color:brown^2 and mazda" is legal with 
eDisMax too?  Notice that I am specifying the color field in the query (lucene 
parser syntax). If the answer is yes, does this mean that "brown" is only 
filtered against the color field and mazda will be filtered against both the 
color field and the brand field?
Thanks!



Re: Solr Cloud setup and deoployment in Azure cloud

2016-07-15 Thread Aniket Khare
Just wanted to add a few more details.
I want to create a set of 7 VMs, where 4 servers will be used for the
SolrCloud setup and 3 VMs as ZooKeeper instances.

On Thu, Jul 14, 2016 at 4:15 PM, Aniket Khare 
wrote:

> Hi,
>
> I was looking for options to deploy SolrCloud in Azure. I am using ARM
> templates for the deployment.
> Please refer to the following links for ARM (Azure Resource Manager) templates:
> https://github.com/Azure/azure-quickstart-templates
>
> https://azure.microsoft.com/en-us/documentation/articles/resource-group-authoring-templates/
>
> Could you please let me know if there is an available ARM template, or
> whether anyone has done a similar deployment.
>
>
> --
> Regards,
>
> Aniket S. Khare
>



-- 
Regards,

Aniket S. Khare


Re: analyzer for _text_ field

2016-07-15 Thread David Santamauro


The opening and closing single quotes don't match

-data-binary '{ ... }’

it should be:

-data-binary '{ ... }'
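
One way to catch this class of mangling before curl ever runs is to check the payload: typographic quotes (U+2018/U+2019) pasted from an email client are not valid shell quoting, and an unbalanced outer quote means the shell will split the payload into garbage arguments. A minimal sketch (the payload below is abbreviated, not the full command from this thread):

```python
import json

# A payload as pasted from an email client: the closing single quote has
# become a typographic quote (U+2019), so the shell never sees a matched pair.
mangled = "'{\"add-field-type\": {\"name\": \"my_new_field_type\"}}\u2019"

def check_payload(raw):
    """Return 'ok' if raw is a single-quoted valid-JSON payload, else a hint."""
    if raw.startswith("'") and raw.endswith("'"):
        raw = raw[1:-1]  # strip the matched shell quotes
    elif raw[0] in "'\u2018" or raw[-1] in "'\u2019":
        return "unbalanced or typographic quotes - fix before running curl"
    try:
        json.loads(raw)
        return "ok"
    except ValueError as err:
        return "invalid JSON: %s" % err

print(check_payload(mangled))         # flags the quote problem
print(check_payload("'{\"a\": 1}'"))  # ok
```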


On 07/15/2016 02:59 PM, Steve Rowe wrote:

Waldyr, maybe it got mangled by my email client or yours?

Here’s the same command:

   

--
Steve
www.lucidworks.com


On Jul 15, 2016, at 2:16 PM, Waldyr Neto  wrote:

Hi Steve, thanks for the help.
Unfortunately I'm making a mistake somewhere.

when i try to run



curl -X POST -H 'Content-type: application/json’ \
http://localhost:8983/solr/gettingstarted/schema --data-binary
'{"add-field-type": { "name": "my_new_field_type", "class":
"solr.TextField","analyzer": {"charFilters": [{"class":
"solr.HTMLStripCharFilterFactory"}], "tokenizer": {"class":
"solr.StandardTokenizerFactory"},"filters":[{"class":
"solr.WordDelimiterFilterFactory"}, {"class":
"solr.LowerCaseFilterFactory"}]}},"replace-field": { "name":
"_text_","type": "my_new_field_type", "multiValued": "true","indexed":
"true","stored": "false"}}’

I receive the following error messages from the curl program:

curl: (3) [globbing] unmatched brace in column 1

curl: (6) Could not resolve host: name

curl: (6) Could not resolve host: my_new_field_type,

curl: (6) Could not resolve host: class

curl: (6) Could not resolve host: solr.TextField,analyzer

curl: (3) [globbing] unmatched brace in column 1

curl: (3) [globbing] bad range specification in column 2

curl: (3) [globbing] unmatched close brace/bracket in column 32

curl: (6) Could not resolve host: tokenizer

curl: (3) [globbing] unmatched brace in column 1

curl: (3) [globbing] unmatched close brace/bracket in column 30

curl: (3) [globbing] unmatched close brace/bracket in column 32

curl: (3) [globbing] unmatched brace in column 1

curl: (3) [globbing] unmatched close brace/bracket in column 28

curl: (3) [globbing] unmatched brace in column 1

curl: (6) Could not resolve host: name

curl: (6) Could not resolve host: _text_,type

curl: (6) Could not resolve host: my_new_field_type,

curl: (6) Could not resolve host: multiValued

curl: (6) Could not resolve host: true,indexed

curl: (6) Could not resolve host: true,stored

curl: (3) [globbing] unmatched close brace/bracket in column 6

cvs1:~ vvisionphp1$

On Fri, Jul 15, 2016 at 2:45 PM, Steve Rowe  wrote:


Hi Waldyr,

An example of changing the _text_ analyzer by first creating a new field
type, and then changing the _text_ field to use the new field type (after
starting Solr 6.1 with “bin/solr start -e schemaless”):

-
PROMPT$ curl -X POST -H 'Content-type: application/json’ \
http://localhost:8983/solr/gettingstarted/schema --data-binary '{
  "add-field-type": {
"name": "my_new_field_type",
"class": "solr.TextField",
"analyzer": {
  "charFilters": [{
"class": "solr.HTMLStripCharFilterFactory"
  }],
  "tokenizer": {
"class": "solr.StandardTokenizerFactory"
  },
  "filters":[{
  "class": "solr.WordDelimiterFilterFactory"
}, {
  "class": "solr.LowerCaseFilterFactory"
  }]}},
  "replace-field": {
"name": "_text_",
"type": "my_new_field_type",
"multiValued": "true",
"indexed": "true",
"stored": "false"
  }}’
-

PROMPT$ curl
http://localhost:8983/solr/gettingstarted/schema/fields/_text_

-
{
  "responseHeader”:{ […] },
  "field":{
"name":"_text_",
"type":"my_new_field_type",
"multiValued":true,
"indexed":true,
"stored":false}}
-

--
Steve
www.lucidworks.com


On Jul 15, 2016, at 12:54 PM, Waldyr Neto  wrote:

Hi, how can I configure the analyzer for the _text_ field?







Re: analyzer for _text_ field

2016-07-15 Thread Steve Rowe
Waldyr, maybe it got mangled by my email client or yours?  

Here’s the same command:

  

--
Steve
www.lucidworks.com

> On Jul 15, 2016, at 2:16 PM, Waldyr Neto  wrote:
> 
> Hi Steve, thanks for the help.
> Unfortunately I'm making a mistake somewhere.
> 
> when i try to run
>>> 
> curl -X POST -H 'Content-type: application/json’ \
> http://localhost:8983/solr/gettingstarted/schema --data-binary
> '{"add-field-type": { "name": "my_new_field_type", "class":
> "solr.TextField","analyzer": {"charFilters": [{"class":
> "solr.HTMLStripCharFilterFactory"}], "tokenizer": {"class":
> "solr.StandardTokenizerFactory"},"filters":[{"class":
> "solr.WordDelimiterFilterFactory"}, {"class":
> "solr.LowerCaseFilterFactory"}]}},"replace-field": { "name":
> "_text_","type": "my_new_field_type", "multiValued": "true","indexed":
> "true","stored": "false"}}’
> 
> I receive the following error messages from the curl program:
> 
> curl: (3) [globbing] unmatched brace in column 1
> 
> curl: (6) Could not resolve host: name
> 
> curl: (6) Could not resolve host: my_new_field_type,
> 
> curl: (6) Could not resolve host: class
> 
> curl: (6) Could not resolve host: solr.TextField,analyzer
> 
> curl: (3) [globbing] unmatched brace in column 1
> 
> curl: (3) [globbing] bad range specification in column 2
> 
> curl: (3) [globbing] unmatched close brace/bracket in column 32
> 
> curl: (6) Could not resolve host: tokenizer
> 
> curl: (3) [globbing] unmatched brace in column 1
> 
> curl: (3) [globbing] unmatched close brace/bracket in column 30
> 
> curl: (3) [globbing] unmatched close brace/bracket in column 32
> 
> curl: (3) [globbing] unmatched brace in column 1
> 
> curl: (3) [globbing] unmatched close brace/bracket in column 28
> 
> curl: (3) [globbing] unmatched brace in column 1
> 
> curl: (6) Could not resolve host: name
> 
> curl: (6) Could not resolve host: _text_,type
> 
> curl: (6) Could not resolve host: my_new_field_type,
> 
> curl: (6) Could not resolve host: multiValued
> 
> curl: (6) Could not resolve host: true,indexed
> 
> curl: (6) Could not resolve host: true,stored
> 
> curl: (3) [globbing] unmatched close brace/bracket in column 6
> 
> cvs1:~ vvisionphp1$
> 
> On Fri, Jul 15, 2016 at 2:45 PM, Steve Rowe  wrote:
> 
>> Hi Waldyr,
>> 
>> An example of changing the _text_ analyzer by first creating a new field
>> type, and then changing the _text_ field to use the new field type (after
>> starting Solr 6.1 with “bin/solr start -e schemaless”):
>> 
>> -
>> PROMPT$ curl -X POST -H 'Content-type: application/json’ \
>>http://localhost:8983/solr/gettingstarted/schema --data-binary '{
>>  "add-field-type": {
>>"name": "my_new_field_type",
>>"class": "solr.TextField",
>>"analyzer": {
>>  "charFilters": [{
>>"class": "solr.HTMLStripCharFilterFactory"
>>  }],
>>  "tokenizer": {
>>"class": "solr.StandardTokenizerFactory"
>>  },
>>  "filters":[{
>>  "class": "solr.WordDelimiterFilterFactory"
>>}, {
>>  "class": "solr.LowerCaseFilterFactory"
>>  }]}},
>>  "replace-field": {
>>"name": "_text_",
>>"type": "my_new_field_type",
>>"multiValued": "true",
>>"indexed": "true",
>>"stored": "false"
>>  }}’
>> -
>> 
>> PROMPT$ curl
>> http://localhost:8983/solr/gettingstarted/schema/fields/_text_
>> 
>> -
>> {
>>  "responseHeader”:{ […] },
>>  "field":{
>>"name":"_text_",
>>"type":"my_new_field_type",
>>"multiValued":true,
>>"indexed":true,
>>"stored":false}}
>> -
>> 
>> --
>> Steve
>> www.lucidworks.com
>> 
>>> On Jul 15, 2016, at 12:54 PM, Waldyr Neto  wrote:
>>> 
>>> Hi, how can I configure the analyzer for the _text_ field?
>> 
>> 



index sql databases

2016-07-15 Thread kostali hassan
I use Solr 5.4.1. When an attribute of type date is all zeros
(0000-00-00 00:00:00), the indexing process stops and the log has an error.
How do I change driver="com.mysql.jdbc.Driver" to ignore the null date?
Last question: how do I set dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'" on the
field for date and time (hh:mm:ss)?


Re: analyzer for _text_ field

2016-07-15 Thread Waldyr Neto
Hi Steve, thanks for the help.
Unfortunately I'm making a mistake somewhere.

when i try to run
>>
curl -X POST -H 'Content-type: application/json’ \
http://localhost:8983/solr/gettingstarted/schema --data-binary
'{"add-field-type": { "name": "my_new_field_type", "class":
"solr.TextField","analyzer": {"charFilters": [{"class":
"solr.HTMLStripCharFilterFactory"}], "tokenizer": {"class":
"solr.StandardTokenizerFactory"},"filters":[{"class":
"solr.WordDelimiterFilterFactory"}, {"class":
"solr.LowerCaseFilterFactory"}]}},"replace-field": { "name":
"_text_","type": "my_new_field_type", "multiValued": "true","indexed":
"true","stored": "false"}}’

I receive the following error messages from the curl program:

curl: (3) [globbing] unmatched brace in column 1

curl: (6) Could not resolve host: name

curl: (6) Could not resolve host: my_new_field_type,

curl: (6) Could not resolve host: class

curl: (6) Could not resolve host: solr.TextField,analyzer

curl: (3) [globbing] unmatched brace in column 1

curl: (3) [globbing] bad range specification in column 2

curl: (3) [globbing] unmatched close brace/bracket in column 32

curl: (6) Could not resolve host: tokenizer

curl: (3) [globbing] unmatched brace in column 1

curl: (3) [globbing] unmatched close brace/bracket in column 30

curl: (3) [globbing] unmatched close brace/bracket in column 32

curl: (3) [globbing] unmatched brace in column 1

curl: (3) [globbing] unmatched close brace/bracket in column 28

curl: (3) [globbing] unmatched brace in column 1

curl: (6) Could not resolve host: name

curl: (6) Could not resolve host: _text_,type

curl: (6) Could not resolve host: my_new_field_type,

curl: (6) Could not resolve host: multiValued

curl: (6) Could not resolve host: true,indexed

curl: (6) Could not resolve host: true,stored

curl: (3) [globbing] unmatched close brace/bracket in column 6

cvs1:~ vvisionphp1$

On Fri, Jul 15, 2016 at 2:45 PM, Steve Rowe  wrote:

> Hi Waldyr,
>
> An example of changing the _text_ analyzer by first creating a new field
> type, and then changing the _text_ field to use the new field type (after
> starting Solr 6.1 with “bin/solr start -e schemaless”):
>
> -
> PROMPT$ curl -X POST -H 'Content-type: application/json’ \
> http://localhost:8983/solr/gettingstarted/schema --data-binary '{
>   "add-field-type": {
> "name": "my_new_field_type",
> "class": "solr.TextField",
> "analyzer": {
>   "charFilters": [{
> "class": "solr.HTMLStripCharFilterFactory"
>   }],
>   "tokenizer": {
> "class": "solr.StandardTokenizerFactory"
>   },
>   "filters":[{
>   "class": "solr.WordDelimiterFilterFactory"
> }, {
>   "class": "solr.LowerCaseFilterFactory"
>   }]}},
>   "replace-field": {
> "name": "_text_",
> "type": "my_new_field_type",
> "multiValued": "true",
> "indexed": "true",
> "stored": "false"
>   }}’
> -
>
> PROMPT$ curl
> http://localhost:8983/solr/gettingstarted/schema/fields/_text_
>
> -
> {
>   "responseHeader”:{ […] },
>   "field":{
> "name":"_text_",
> "type":"my_new_field_type",
> "multiValued":true,
> "indexed":true,
> "stored":false}}
> -
>
> --
> Steve
> www.lucidworks.com
>
> > On Jul 15, 2016, at 12:54 PM, Waldyr Neto  wrote:
> >
> > Hi, how can I configure the analyzer for the _text_ field?
>
>


Re: analyzer for _text_ field

2016-07-15 Thread Steve Rowe
Hi Waldyr,

An example of changing the _text_ analyzer by first creating a new field type, 
and then changing the _text_ field to use the new field type (after starting 
Solr 6.1 with “bin/solr start -e schemaless”):

-
PROMPT$ curl -X POST -H 'Content-type: application/json’ \
http://localhost:8983/solr/gettingstarted/schema --data-binary '{
  "add-field-type": {
"name": "my_new_field_type",
"class": "solr.TextField",
"analyzer": {
  "charFilters": [{
"class": "solr.HTMLStripCharFilterFactory"
  }],
  "tokenizer": {
"class": "solr.StandardTokenizerFactory"
  },
  "filters":[{
  "class": "solr.WordDelimiterFilterFactory"
}, {
  "class": "solr.LowerCaseFilterFactory"
  }]}},
  "replace-field": {
"name": "_text_",
"type": "my_new_field_type",
"multiValued": "true",
"indexed": "true",
"stored": "false"
  }}’
-

PROMPT$ curl http://localhost:8983/solr/gettingstarted/schema/fields/_text_

-
{
  "responseHeader”:{ […] },
  "field":{
"name":"_text_",
"type":"my_new_field_type",
"multiValued":true,
"indexed":true,
"stored":false}}
-

--
Steve
www.lucidworks.com

> On Jul 15, 2016, at 12:54 PM, Waldyr Neto  wrote:
> 
> Hi, how can I configure the analyzer for the _text_ field?



RE: [Non-DoD Source] Re: SimplePostTool error (UNCLASSIFIED)

2016-07-15 Thread Musshorn, Kris T CTR USARMY RDECOM ARL (US)
CLASSIFICATION: UNCLASSIFIED

Thanks Yonik and Erick,

If I set -filetypes 
csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,rtf,htm,html,txt would 
this prevent indexing of xml files? 

Why does the simple post tool index .cfm files with this or default settings?

Thanks,
Kris

~~
Kris T. Musshorn
FileMaker Developer - Contractor – Catapult Technology Inc.  
US Army Research Lab 
Aberdeen Proving Ground 
Application Management & Development Branch 
410-278-7251
kris.t.musshorn@mail.mil
~~


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Friday, July 15, 2016 12:30 PM
To: solr-user 
Subject: [Non-DoD Source] Re: SimplePostTool error (UNCLASSIFIED)

simplePostTool is just that, simple. It's intended to get you started.
It is not a full-featured web crawler. As such, if you're encountering wonky 
web pages that are not well formed HTML there's no guarantee that it'll handle 
them gracefully.

Crawling websites is a pain, so if you require something robust I'd investigate 
Nutch (which integrates with Solr/Lucene) or similar.

Best,
Erick

On Fri, Jul 15, 2016 at 9:01 AM, Musshorn, Kris T CTR USARMY RDECOM ARL (US) 
 wrote:
> CLASSIFICATION: UNCLASSIFIED
>
> How do I correct this error when running the simple post tool against a 
> website?
> The tool successfully indexed for about 30 mins before throwing this error 
> and terminating.
>
> [Fatal Error] :642:15: XML document structures must start and end within the 
> same entity.
> Exception in thread "main" java.lang.RuntimeException: 
> org.xml.sax.SAXParseException; lineNumber: 642; columnNumber: 15; XML 
> document structures must start and end within the same entity.
> at 
> org.apache.solr.util.SimplePostTool$PageFetcher.getLinksFromWebPage(SimplePostTool.java:1219)
> at 
> org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:601)
> at 
> org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:618)
> at 
> org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:618)
> at 
> org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:618)
> at 
> org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:618)
> at 
> org.apache.solr.util.SimplePostTool.postWebPages(SimplePostTool.java:548)
> at 
> org.apache.solr.util.SimplePostTool.doWebMode(SimplePostTool.java:351)
> at 
> org.apache.solr.util.SimplePostTool.execute(SimplePostTool.java:182)
> at 
> org.apache.solr.util.SimplePostTool.main(SimplePostTool.java:167)
> Caused by: org.xml.sax.SAXParseException; lineNumber: 642; columnNumber: 15; 
> XML document structures must start and end within the same entity.
> at 
> com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
> at 
> com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
> at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
> at 
> org.apache.solr.util.SimplePostTool.makeDom(SimplePostTool.java:1028)
> at 
> org.apache.solr.util.SimplePostTool$PageFetcher.getLinksFromWebPage(SimplePostTool.java:1201)
> ... 9 more
>
> Thanks,
> Kris
>
> ~~
> Kris T. Musshorn
> FileMaker Developer - Contractor - Catapult Technology Inc.
> US Army Research Lab
> Aberdeen Proving Ground
> Application Management & Development Branch
> 410-278-7251
> kris.t.musshorn@mail.mil
> ~~
>
>
>
> CLASSIFICATION: UNCLASSIFIED


CLASSIFICATION: UNCLASSIFIED


analyzer for _text_ field

2016-07-15 Thread Waldyr Neto
Hi, how can I configure the analyzer for the _text_ field?


Re: SimplePostTool error (UNCLASSIFIED)

2016-07-15 Thread Yonik Seeley
On Fri, Jul 15, 2016 at 12:29 PM, Erick Erickson
 wrote:
> simplePostTool is just that, simple. It's intended to get you started.
> It is not a full-featured web crawler. As such, if you're encountering
> wonky web pages that are not well formed HTML there's no guarantee
> that it'll handle them gracefully.

HTML is not well formed XML though.  Hopefully we're not using an XML
parser to try and parse HTML?
The error message "XML document structures must start and end within
the same entity." is true for XML, but not for HTML.
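
Yonik's distinction is easy to demonstrate with the Python standard library: the same snippet that a strict XML parser rejects (an unclosed <br>) goes through an HTML parser without complaint:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

snippet = "<html><body>Hello<br>world</body></html>"  # valid HTML, invalid XML

# The XML parser requires every element to close within its parent entity,
# so the unclosed <br> makes it fail (the same class of error as above).
try:
    ET.fromstring(snippet)
    xml_ok = True
except ET.ParseError:
    xml_ok = False

class TagCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.tags = []
    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

collector = TagCollector()
collector.feed(snippet)  # no error: HTML allows void elements like <br>

print(xml_ok)          # False
print(collector.tags)  # ['html', 'body', 'br']
```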

-Yonik


Re: A working example to play with Naive Bayes classifier

2016-07-15 Thread Alessandro Benedetti
But how big is your index? Are you expecting Solr to automatically
classify your documents without any ground-truth knowledge base?
Please attach an example of the schema.
There was a reason I asked you :)
It seems related to the fact that we get no tokens from the text analysis.

Cheers
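
For anyone reproducing this setup: in Solr 6.x the classifier is wired in as an update request processor. A sketch of what such a chain usually looks like in solrconfig.xml follows; treat the exact parameter names as assumptions to verify against the Solr Reference Guide, and note the field names here just follow the example documents in this thread:

```xml
<updateRequestProcessorChain name="classification">
  <processor class="solr.ClassificationUpdateProcessorFactory">
    <!-- fields whose text feeds the classifier -->
    <str name="inputFields">title_t,author_s</str>
    <!-- field to fill with the predicted class -->
    <str name="classField">cat_s</str>
    <!-- naive Bayes; "knn" is the usual alternative -->
    <str name="algorithm">bayes</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

As Alessandro points out, the classifier can only assign classes it has already seen: the index needs a reasonable number of documents with cat_s populated before new documents can be classified.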

On Fri, Jul 15, 2016 at 12:11 PM, Tomas Ramanauskas <
tomas.ramanaus...@springer.com> wrote:

> Hi, Allesandro,
>
> sorry for the delay. What do you mean?
>
>
> As I mentioned earlier, I followed a super simply set of steps.
>
> 1. Download Solr
> 2. Configure classification
> 3. Create some documents using curl over HTTP.
>
>
> Is it difficult to reproduce the steps / problem?
>
>
> Tomas
>
>
>
> > On 23 Jun 2016, at 16:42, Alessandro Benedetti <
> benedetti.ale...@gmail.com> wrote:
> >
> > Can you give an example of your schema, and can you run a simple query
> for
> > you index, curious to see how the input fields are analyzed.
> >
> > Cheers
> >
> > On Wed, Jun 22, 2016 at 6:05 PM, Alessandro Benedetti <
> > benedetti.ale...@gmail.com> wrote:
> >
> >> This is better!  At least the classifier is invoked!
> >> How many docs in the index have the class assigned?
> >> Take a look at the stack trace and you should find the cause!
> >> I am now on mobile, I will check the code tomorrow!
> >> Cheers
> >> On 22 Jun 2016 5:26 pm, "Tomas Ramanauskas" <
> >> tomas.ramanaus...@springer.com> wrote:
> >>
> >>>
> >>> I also tried with this config (adding **):
> >>>
> >>>
> >>> [solrconfig.xml snippet mangled in the archive; it defined an update
> >>> request processor chain named "classification"]
> >>>
> >>> And I get the error:
> >>>
> >>>
> >>>
> >>> $ curl http://localhost:8983/solr/demo/update -d '
> >>> [
> >>> {"id" : "book15",
> >>> "title_t":["The Way of Kings"],
> >>> "author_s":"Brandon Sanderson",
> >>> "cat_s": null,
> >>> "pubyear_i":2010,
> >>> "ISBN_s":"978-0-7653-2635-5"
> >>> }
> >>> ]'
> >>>
> {"responseHeader":{"status":500,"QTime":29},"error":{"trace":"java.lang.NullPointerException\n\tat
> >>>
> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.getTokenArray(SimpleNaiveBayesDocumentClassifier.java:202)\n\tat
> >>>
> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.analyzeSeedDocument(SimpleNaiveBayesDocumentClassifier.java:162)\n\tat
> >>>
> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.assignNormClasses(SimpleNaiveBayesDocumentClassifier.java:121)\n\tat
> >>>
> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.assignClass(SimpleNaiveBayesDocumentClassifier.java:81)\n\tat
> >>>
> org.apache.solr.update.processor.ClassificationUpdateProcessor.processAdd(ClassificationUpdateProcessor.java:94)\n\tat
> >>>
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(JsonLoader.java:474)\n\tat
> >>>
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:138)\n\tat
> >>>
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:114)\n\tat
> >>>
> org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:77)\n\tat
> >>>
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)\n\tat
> >>>
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)\n\tat
> >>>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)\n\tat
> >>> org.apache.solr.core.SolrCore.execute(SolrCore.java:2036)\n\tat
> >>>
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:657)\n\tat
> >>> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)\n\tat
> >>>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)\n\tat
> >>>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)\n\tat
> >>>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)\n\tat
> >>>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)\n\tat
> >>>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
> >>>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
> >>>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\n\tat
> >>>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)\n\tat
> >>>
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)\n\tat
> >>>
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat
> >>>
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)\n\tat
> >>>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
> >>>
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\n\tat
> >>>
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\n\tat
> >>>
> 

Re: Changing scoring in RankQuery

2016-07-15 Thread Joel Bernstein
I think you'll have to handle the complexities you describe if you use the
RankQuery.

The RankQuery does bring with it a fair amount of complexity to get
everything right. The ReRankQParserPlugin provides a pretty good example to
review.

Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, Jul 15, 2016 at 8:07 AM, Ankit Singh  wrote:

> Hi,
>
> I am trying to understand how to influence scoring via RankQuery in solr >=
> 4.9. My requirements are twofold:
>
> 1) Use a custom collector
> 2) Change scoring of docs
>
> As far as I could understand, if we just want to change the scoring, we can
> write a QueryParserPlugin and extend CustomScoreQuery to introduce a
> CustomScoreProvider and modify the customScore method so that whenever we
> call scorer.score(), this new custom scorer would be used. This works
> perfectly in isolation (not using RankQuery).
>
> However, I want to use a custom scorer in my query object extending
> RankQuery (where I am setting a custom collector). One way to do this is to
> override the setScorer method and initialise an object of the new scorer
> here. The problem I'm experiencing here is that to modify the scoring
> explanation, I would need to rewrite the weight class, which is one
> complexity I am trying to avoid. I would need the same scorer in a
> post-filter I wrote, and going by this method, I would need to set the
> scorer even in the post-filter.
>
> Is there another way to change scoring of docs while also using RankQuery
> API, which lets us change the explanation in an easier way?
>
> Regards,
> Ankit Singh
>


Re: search documents that have a specific field populated

2016-07-15 Thread Erick Erickson
First of all I get an error message on startup with a field definition like:
default=EMPTY.

So I'm assuming you are putting it in quotes:
default="EMPTY"

The query
q=field:[* TO *] should be fine, as is John's
query
fq=field:[* TO *]

So something's wonky here. I'd start by
deleting the entire index on the theory that
you're in a state you don't remember (i.e.
perhaps during some of your experiments
you now have docs with "EMPTY" in the field,
some blank and whatever else).

Second, don't provide any default for the field.

At that point q=field:[* TO *] should work. If it doesn't,
please post again with
1> a sample doc
2> the field definition exactly as in your schema
file (please copy/paste).
3> the results of adding &debug=query to the URL.

Best,
Erick
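A quick URL-encoding sketch of the suggested query, for reference (the collection name "mycollection" is made up for illustration; the standard library takes care of escaping the brackets):

```python
from urllib.parse import urlencode

# "mycollection" is a hypothetical collection name.
params = urlencode({"q": "field:[* TO *]", "debug": "query"})
url = "http://localhost:8983/solr/mycollection/select?" + params
print(url)
```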

On Fri, Jul 15, 2016 at 8:16 AM, John Blythe  wrote:
> we use something in between: fq=fieldName:[* TO *]
>
> --
> *John Blythe*
> Product Manager & Lead Developer
>
> 251.605.3071 | j...@curvolabs.com
> www.curvolabs.com
>
> 58 Adams Ave
> Evansville, IN 47713
>
> On Fri, Jul 15, 2016 at 10:17 AM, Valentina Cavazza 
> wrote:
>
>> Hi,
>> I need to search documents that have a specific field populated, so I want
>> to display all the documents that have the field not empty.
>> This field in schema is set multivalued=true, indexed=true, stored=true,
>> default=EMPTY.
>> This field type is solr.TextField class, use StandardTokenizerFactory
>> tokenizer, ICUFoldingFilterFactory filter, LowerCaseFilterFactory filter
>> and GreekStemFilterFactory filter in the index and query analyzer.
>> I already tried queries like this:
>> q=field:*
>> q=+field:*
>> q=+field:[* TO *]
>> q=+field:['' TO *]
>> q=+field:["" TO *]
>> q=+field:[' ' TO *]
>> q=+field:' '
>> q=-field:EMPTY
>>
>> but nothing found.
>> Does anyone know how to do that?
>> Thanks
>>
>> Valentina
>>


Re: SimplePostTool error (UNCLASSIFIED)

2016-07-15 Thread Erick Erickson
simplePostTool is just that, simple. It's intended to get you started.
It is not a full-featured web crawler. As such, if you're encountering
wonky web pages that are not well formed HTML there's no guarantee
that it'll handle them gracefully.

Crawling websites is a pain, so if you require something robust
I'd investigate Nutch (which integrates with Solr/Lucene) or
similar.

Best,
Erick

On Fri, Jul 15, 2016 at 9:01 AM, Musshorn, Kris T CTR USARMY RDECOM
ARL (US)  wrote:
> CLASSIFICATION: UNCLASSIFIED
>
> How do I correct this error when running the simple post tool against a 
> website?
> The tool successfully indexed for about 30 mins before throwing this error 
> and terminating.
>
> [Fatal Error] :642:15: XML document structures must start and end within the 
> same entity.
> Exception in thread "main" java.lang.RuntimeException: 
> org.xml.sax.SAXParseException; lineNumber: 642; columnNumber: 15; XML 
> document structures must start and end within the same entity.
> at 
> org.apache.solr.util.SimplePostTool$PageFetcher.getLinksFromWebPage(SimplePostTool.java:1219)
> at 
> org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:601)
> at 
> org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:618)
> at 
> org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:618)
> at 
> org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:618)
> at 
> org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:618)
> at 
> org.apache.solr.util.SimplePostTool.postWebPages(SimplePostTool.java:548)
> at 
> org.apache.solr.util.SimplePostTool.doWebMode(SimplePostTool.java:351)
> at 
> org.apache.solr.util.SimplePostTool.execute(SimplePostTool.java:182)
> at org.apache.solr.util.SimplePostTool.main(SimplePostTool.java:167)
> Caused by: org.xml.sax.SAXParseException; lineNumber: 642; columnNumber: 15; 
> XML document structures must start and end within the same entity.
> at 
> com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
> at 
> com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
> at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
> at 
> org.apache.solr.util.SimplePostTool.makeDom(SimplePostTool.java:1028)
> at 
> org.apache.solr.util.SimplePostTool$PageFetcher.getLinksFromWebPage(SimplePostTool.java:1201)
> ... 9 more
>
> Thanks,
> Kris
>
> ~~
> Kris T. Musshorn
> FileMaker Developer - Contractor - Catapult Technology Inc.
> US Army Research Lab
> Aberdeen Proving Ground
> Application Management & Development Branch
> 410-278-7251
> kris.t.musshorn@mail.mil
> ~~
>
>
>
> CLASSIFICATION: UNCLASSIFIED


SimplePostTool error (UNCLASSIFIED)

2016-07-15 Thread Musshorn, Kris T CTR USARMY RDECOM ARL (US)
CLASSIFICATION: UNCLASSIFIED

How do I correct this error when running the simple post tool against a website?
The tool successfully indexed for about 30 mins before throwing this error and 
terminating.

[Fatal Error] :642:15: XML document structures must start and end within the 
same entity.
Exception in thread "main" java.lang.RuntimeException: 
org.xml.sax.SAXParseException; lineNumber: 642; columnNumber: 15; XML document 
structures must start and end within the same entity.
at 
org.apache.solr.util.SimplePostTool$PageFetcher.getLinksFromWebPage(SimplePostTool.java:1219)
at org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:601)
at org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:618)
at org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:618)
at org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:618)
at org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:618)
at 
org.apache.solr.util.SimplePostTool.postWebPages(SimplePostTool.java:548)
at 
org.apache.solr.util.SimplePostTool.doWebMode(SimplePostTool.java:351)
at org.apache.solr.util.SimplePostTool.execute(SimplePostTool.java:182)
at org.apache.solr.util.SimplePostTool.main(SimplePostTool.java:167)
Caused by: org.xml.sax.SAXParseException; lineNumber: 642; columnNumber: 15; 
XML document structures must start and end within the same entity.
at 
com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
at 
com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
at org.apache.solr.util.SimplePostTool.makeDom(SimplePostTool.java:1028)
at 
org.apache.solr.util.SimplePostTool$PageFetcher.getLinksFromWebPage(SimplePostTool.java:1201)
... 9 more

Thanks,
Kris

~~
Kris T. Musshorn
FileMaker Developer - Contractor - Catapult Technology Inc.  
US Army Research Lab 
Aberdeen Proving Ground 
Application Management & Development Branch 
410-278-7251
kris.t.musshorn@mail.mil
~~



CLASSIFICATION: UNCLASSIFIED

Re: solrcloud so many connections

2016-07-15 Thread Kent Mu
Hello, has anybody else come across this issue? Can anybody help me?

In addition, besides the query speed becoming slow, the speed of
writing to the index has also become slow.

looking forward to your reply.

Thanks.

Kent


2016-07-14 13:01 GMT+08:00 Kent Mu :

> Hi friends!
>
> We are using SolrJ 4.9.1 to connect to ZooKeeper, and the Solr server
> version is 4.9.0. We are currently using CloudSolrServer as a singleton. I
> believe that the SolrJ-to-ZooKeeper link is a TCP connection, and that the
> internal traffic within the SolrCloud cluster is actually an HTTP connection.
>
> We use Zabbix to monitor the SolrCloud status, and we deploy Solr in
> WildFly, on port 8180 for example. We find that the number of connections
> to Solr on port 8180 is very high; at the moment it can be around 4000,
> which is too large.
>
> And we find that with the increasing connections, the query speed becomes
> slow.
>
> does anyone come across this issue too?
>
> look forward to your reply.
>
> Thanks.
> Kent
>
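Kent's singleton approach is the right instinct: reusing one client keeps the TCP connection count flat, while opening a connection per request multiplies it. A Solr-independent sketch with Python's standard library (a toy HTTP server that counts accepted connections; nothing here is SolrJ's actual code):

```python
import http.client
import http.server
import socketserver
import threading

connections = []

class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep-alive, so a client can reuse the socket

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet

class CountingServer(socketserver.ThreadingMixIn, http.server.HTTPServer):
    def get_request(self):
        # Count every accepted TCP connection.
        sock, addr = super().get_request()
        connections.append(addr)
        return sock, addr

server = CountingServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# A new client per request: 5 requests open 5 TCP connections.
for _ in range(5):
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", "/")
    conn.getresponse().read()
    conn.close()
per_request = len(connections)

# One reused (singleton-style) client: 5 requests share 1 TCP connection.
connections.clear()
conn = http.client.HTTPConnection("127.0.0.1", port)
for _ in range(5):
    conn.request("GET", "/")
    conn.getresponse().read()
conn.close()
reused = len(connections)

print(per_request, reused)
server.shutdown()
```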


Changing scoring in RankQuery

2016-07-15 Thread Ankit Singh
Hi,

I am trying to understand how to influence scoring via RankQuery in solr >=
4.9. My requirements are twofold:

1) Use a custom collector
2) Change scoring of docs

As far as I could understand, if we just want to change the scoring, we can
write a QueryParserPlugin and extend CustomScoreQuery to introduce a
CustomScoreProvider and modify the customScore method so that whenever we
call scorer.score(), this new custom scorer would be used. This works
perfectly in isolation (not using RankQuery).

However, I want to use a custom scorer in my query object extending
RankQuery (where I am setting a custom collector). One way to do this is to
override the setScorer method and initialise an object of the new scorer
here. The problem I'm experiencing here is that to modify the scoring
explanation, I would need to rewrite the weight class, which is one
complexity I am trying to avoid. I would need the same scorer in a
post-filter I wrote, and going by this method, I would need to set the
scorer even in the post-filter.

Is there another way to change scoring of docs while also using RankQuery
API, which lets us change the explanation in an easier way?

Regards,
Ankit Singh


Re: search documents that have a specific field populated

2016-07-15 Thread John Blythe
we use something in between: fq=fieldName:[* TO *]

-- 
*John Blythe*
Product Manager & Lead Developer

251.605.3071 | j...@curvolabs.com
www.curvolabs.com

58 Adams Ave
Evansville, IN 47713

On Fri, Jul 15, 2016 at 10:17 AM, Valentina Cavazza 
wrote:

> Hi,
> I need to search documents that have a specific field populated, so I want
> to display all the documents that have the field not empty.
> This field in schema is set multivalued=true, indexed=true, stored=true,
> default=EMPTY.
> This field type is solr.TextField class, use StandardTokenizerFactory
> tokenizer, ICUFoldingFilterFactory filter, LowerCaseFilterFactory filter
> and GreekStemFilterFactory filter in the index and query analyzer.
> I already tried queries like this:
> q=field:*
> q=+field:*
> q=+field:[* TO *]
> q=+field:['' TO *]
> q=+field:["" TO *]
> q=+field:[' ' TO *]
> q=+field:' '
> q=-field:EMPTY
>
> but nothing found.
> Does anyone know how to do that?
> Thanks
>
> Valentina
>


RE: search documents that have a specific field populated

2016-07-15 Thread Jamal, Sarfaraz
If I understand you properly, I do it using a Filter Query:

fq=NOT(field:EMPTY)

Hope that helps -

Sas


-Original Message-
From: Valentina Cavazza [mailto:valent...@step-net.it] 
Sent: Friday, July 15, 2016 10:17 AM
To: solr-user@lucene.apache.org
Subject: search documents that have a specific field populated

Hi,
I need to search documents that have a specific field populated, so I want to 
display all the documents that have the field not empty.
This field in schema is set multivalued=true, indexed=true, stored=true, 
default=EMPTY.
This field type is solr.TextField class, use StandardTokenizerFactory 
tokenizer, ICUFoldingFilterFactory filter, LowerCaseFilterFactory filter and 
GreekStemFilterFactory filter in the index and query analyzer.
I already tried queries like this:
q=field:*
q=+field:*
q=+field:[* TO *]
q=+field:['' TO *]
q=+field:["" TO *]
q=+field:[' ' TO *]
q=+field:' '
q=-field:EMPTY

but nothing found.
Does anyone know how to do that?
Thanks

Valentina


search documents that have a specific field populated

2016-07-15 Thread Valentina Cavazza

Hi,
I need to search documents that have a specific field populated, so I 
want to display all the documents that have the field not empty.
This field in schema is set multivalued=true, indexed=true, stored=true, 
default=EMPTY.
This field type is solr.TextField class, use StandardTokenizerFactory 
tokenizer, ICUFoldingFilterFactory filter, LowerCaseFilterFactory filter 
and GreekStemFilterFactory filter in the index and query analyzer.

I already tried queries like this:
q=field:*
q=+field:*
q=+field:[* TO *]
q=+field:['' TO *]
q=+field:["" TO *]
q=+field:[' ' TO *]
q=+field:' '
q=-field:EMPTY

but nothing found.
Does anyone know how to do that?
Thanks

Valentina


RE: How to create highlight search component using Config API

2016-07-15 Thread Alexandre Drouin
Hi,

Thanks for info, however I got an error when adding the field using curl (sorry 
for the long lines):
Command I used:
curl -k -u user:password https://solr:8443/solr/q/config -H 
'Content-type:application/json'  -d '{ "add-searchcomponent": { "highlight": { 
"name": "myHighlight", "class": "srl'.HighlightComponent", "": { "gap": { 
"default": "true", "name": "gap", "class": "solr.highlight.GapFragmente,"' 
"defaults": { "hl.fragsize": 100 } } }, "html": [{ "default": "true", "name": 
"html", "class": "solr.highlith'.HtmlFormatter", "defaults": { "hl.simple.pre": 
"before-", "hl.simple.post": "-after" }}, { "name": "html", "class": 
"solr.highlight.HtmlEncoder"}] }}}'

The error I got was : 
'name' is a required field", "'class' is a required field"

I was able to figure out the correct format:
curl -k -u user:password https://solr:8443/solr/q/config -H 
'Content-type:application/json'  -d '{ "add-searchcomponent": { "name": 
"highlight", "class": "solr.HighlightComponent", "": { "gap": { "default": 
"true", "name": "gap", "class": "solr.highlight.GapFragmenter", "defaults": { 
"hl.fragsize": 100 } } }, "html": [{ "default": "true", "name": "html", 
"class": "solr.highlight.HtmlFormatter", "defaults": { "hl.simple.pre": 
before-", "hl.simple.post": "-after" }}, { "name": "html", "class": 
"solr.highlight.HtmlEncoder"}] }}'

It was accepted by Solr and I can see it when going to the url 
"solr//config".  As you can see I use the 'highlight' name to 
override the default one however my highlighter is not picked up by my request 
handler.  If I remove my highlight search component from the API and add it to 
solrconfig.xml it is picked up by my request handler.  

I compared the output of the url "solr/<core>/config" (solrconfig.xml 
version) and it was identical to what I had before when I added the search 
component using the API.  I am at a loss why the search component works  when 
using solrconfig.xml but doesn't when using the Config API.  Do you know why 
this is the case?

I am using Solr 6.0.1 with ZooKeeper.

Thanks for any help

Alexandre Drouin

-Original Message-
From: Cassandra Targett [mailto:casstarg...@gmail.com] 
Sent: July 8, 2016 5:04 PM
To: solr-user@lucene.apache.org
Subject: Re: How to create highlight search component using Config API

If you already have highlighting defined from one of the default configsets, 
you can see an example of how the JSON is structured with a Config API request. 
I assume you already tried that, but pointing it out just in case.

Defining a highlighter with the Config API is a bit confusing to be honest, but 
I worked out something that works:

{"add-searchcomponent": {"highlight": {"name":"myHighlight",
"class":"solr.HighlightComponent","": {"gap": {"default":"true",
"name": "gap", "class":"solr.highlight.GapFragmenter",
"defaults":{"hl.fragsize":100}}},"html":[{"default": "true","name":
"html","class": "solr.highlight.HtmlFormatter","defaults":
{"hl.simple.pre":"",
"hl.simple.post":""}},{"name": "html","class":
"solr.highlight.HtmlEncoder"}]}}}

Note there is an empty string after the initial class definition (shown as ""). 
That lets you then add the fragmenters.

(I tried to prettify that, but my mail client isn't cooperating. I'm going to 
add this example to the Solr Ref Guide, though, so it might be easier to see 
there in a few minutes.)

Hope it helps -
Cassandra

On Wed, Jun 29, 2016 at 8:00 AM, Alexandre Drouin 
 wrote:
> Hi,
>
> I'm trying to create a highlight search component using the Config API of 
> Solr 6.0.1 however I cannot figure out how to include the elements 
> fragmenter, formatter, encoder, etc...
>
> Let's say I have the following component:
>
>   <searchComponent class="solr.HighlightComponent" name="myHighlightingComponent">
>     <highlighting>
>       <fragmenter name="gap" default="true" class="solr.highlight.GapFragmenter">
>         <lst name="defaults">
>           <int name="hl.fragsize">100</int>
>         </lst>
>       </fragmenter>
>       <formatter name="html" default="true" class="solr.highlight.HtmlFormatter">
>         <lst name="defaults">
>           <str name="hl.simple.pre"><![CDATA[<em>]]></str>
>           <str name="hl.simple.post"><![CDATA[</em>]]></str>
>         </lst>
>       </formatter>
>       <encoder name="html" class="solr.highlight.HtmlEncoder"/>
>     </highlighting>
>   </searchComponent>
>
> From what I can see from the documentation my JSON should look a bit like 
> this:
>
> {
>   "add-searchcomponent":{
> "name":"myHighlightingComponent",
> "class":"solr.HighlightComponent",
> ??
>   }
> }
>
> However I have no idea how to define the 2 fragmenters or the encoder.  Any 
> help is appreciated.
>
> Thanks
> Alex
>


Re: A working example to play with Naive Bayes classifier

2016-07-15 Thread Tomas Ramanauskas
Hi, Allesandro,

sorry for the delay. What do you mean?


As I mentioned earlier, I followed a super simple set of steps.

1. Download Solr
2. Configure classification 
3. Create some documents using curl over HTTP.


Is it difficult to reproduce the steps / problem?


Tomas



> On 23 Jun 2016, at 16:42, Alessandro Benedetti  
> wrote:
> 
> Can you give an example of your schema, and can you run a simple query for
> you index, curious to see how the input fields are analyzed.
> 
> Cheers
> 
> On Wed, Jun 22, 2016 at 6:05 PM, Alessandro Benedetti <
> benedetti.ale...@gmail.com> wrote:
> 
>> This is better!  At least the classifier is invoked!
>> How many docs in the index have the class assigned?
>> Take a look at the stack trace and you should find the cause!
>> I am now on mobile, I will check the code tomorrow!
>> Cheers
>> On 22 Jun 2016 5:26 pm, "Tomas Ramanauskas" <
>> tomas.ramanaus...@springer.com> wrote:
>> 
>>> 
>>> I also tried with this config (adding **):
>>> 
>>> 
>>> [solrconfig.xml snippet mangled in the archive; it defined an update
>>> request processor chain named "classification"]
>>> 
>>> And I get the error:
>>> 
>>> 
>>> 
>>> $ curl http://localhost:8983/solr/demo/update -d '
>>> [
>>> {"id" : "book15",
>>> "title_t":["The Way of Kings"],
>>> "author_s":"Brandon Sanderson",
>>> "cat_s": null,
>>> "pubyear_i":2010,
>>> "ISBN_s":"978-0-7653-2635-5"
>>> }
>>> ]'
>>> {"responseHeader":{"status":500,"QTime":29},"error":{"trace":"java.lang.NullPointerException\n\tat
>>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.getTokenArray(SimpleNaiveBayesDocumentClassifier.java:202)\n\tat
>>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.analyzeSeedDocument(SimpleNaiveBayesDocumentClassifier.java:162)\n\tat
>>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.assignNormClasses(SimpleNaiveBayesDocumentClassifier.java:121)\n\tat
>>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.assignClass(SimpleNaiveBayesDocumentClassifier.java:81)\n\tat
>>> org.apache.solr.update.processor.ClassificationUpdateProcessor.processAdd(ClassificationUpdateProcessor.java:94)\n\tat
>>> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(JsonLoader.java:474)\n\tat
>>> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:138)\n\tat
>>> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:114)\n\tat
>>> org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:77)\n\tat
>>> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)\n\tat
>>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)\n\tat
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)\n\tat
>>> org.apache.solr.core.SolrCore.execute(SolrCore.java:2036)\n\tat
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:657)\n\tat
>>> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)\n\tat
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)\n\tat
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)\n\tat
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)\n\tat
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)\n\tat
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\n\tat
>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)\n\tat
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)\n\tat
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat
>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)\n\tat
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
>>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\n\tat
>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\n\tat
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
>>> org.eclipse.jetty.server.Server.handle(Server.java:518)\n\tat
>>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)\n\tat
>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)\n\tat
>>> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)\n\tat
>>> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)\n\tat
>>> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\n\tat
>>>