Re: Remove operation of partial update doesn't work

2015-07-08 Thread Mohsen Saboorian
In my code, when the operation is add, it works correctly on a multivalued
field. But no multivalued field value can be deleted with the remove operation.
The add operation adds a value to a multivalued field. The remove operation
removes a value from a multivalued field.

If you believe that something is not working, please state clearly why you
believe that something is not working. Start by describing the symptom.


-- Jack Krupansky

On Mon, Jul 6, 2015 at 9:22 PM, Mohsen Saboorian mohs...@gmail.com wrote:

 I can partially 'add' fields to my Solr index, but 'remove' operation
seems
 not working. I'm on Solr 4.10. Here is my SolrJ snippet:

 SolrInputDocument doc = new SolrInputDocument();
 Map<String, Object> partialUpdate = new HashMap<>();
 partialUpdate.put(operation, value); // value can be an object
 (string, number, etc.) or a list; operation can be "add", "set" or "remove".
 doc.addField("id", id); // document id
 doc.addField(fieldName, partialUpdate);

 getSolrServer().add(doc, commitWithin);

 Is there anything wrong with my code?



Re: Too many Soft commits and opening searchers realtime

2015-07-08 Thread Mikhail Khludnev
Summer,

A log excerpt usually helps to troubleshoot any magic. Would you mind
providing one?

On Wed, Jul 8, 2015 at 2:30 PM, Alessandro Benedetti 
benedetti.ale...@gmail.com wrote:

 So you are saying that no-one is triggering any commit, and that the auto
 soft commit solution is not actually waiting for the configured time?
 I suspect something is not as described, because if auto soft commit
 were not working I would expect thousands of bugs raised.

 Let's dig a little bit into the details…
 What are you using, exactly, to index content?
 Maybe some commit is actually hidden there :)

 Cheers

 2015-07-08 2:21 GMT+01:00 Summer Shire shiresum...@gmail.com:

  No the client lets solr handle it.
 
 
   On Jul 7, 2015, at 2:38 PM, Mike Drob mad...@cloudera.com wrote:
  
   Are the clients that are posting updates requesting commits?
  
   On Tue, Jul 7, 2015 at 4:29 PM, Summer Shire shiresum...@gmail.com
  wrote:
  
   HI All,
  
   Can someone help me understand the following behavior.
   I have the following maxTimes on hard and soft commits
  
   yet I see a lot of Opening Searchers in the log
   org.apache.solr.search.SolrIndexSearcher - Opening Searcher@1656a258
  [main]
   realtime
   also I see a soft commit happening almost every 30 secs
   org.apache.solr.update.UpdateHandler - start
  
 
 commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
   <autoCommit>
     <maxTime>480000</maxTime>
     <openSearcher>false</openSearcher>
   </autoCommit>

   <autoSoftCommit>
     <maxTime>180000</maxTime>
   </autoSoftCommit>
   I tried disabling softCommit by setting maxTime to -1.
   On startup solrCore recognized it and logged Soft AutoCommit:
 disabled
   but I could still see softCommit=true
   org.apache.solr.update.UpdateHandler - start
  
 
 commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
   <autoSoftCommit>
     <maxTime>-1</maxTime>
   </autoSoftCommit>
  
   Thanks,
   Summer
 



 --
 --

 Benedetti Alessandro
 Visiting card : http://about.me/alessandro_benedetti

 Tyger, tyger burning bright
 In the forests of the night,
 What immortal hand or eye
 Could frame thy fearful symmetry?

 William Blake - Songs of Experience -1794 England




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
mkhlud...@griddynamics.com


Re: Too many Soft commits and opening searchers realtime

2015-07-08 Thread Yonik Seeley
A realtime searcher is necessary for internal bookkeeping / uses if a
normal searcher isn't opened on a commit.
This searcher doesn't have caches and hence doesn't carry the weight
that a normal searcher would.  It's also invisible to clients (it
doesn't change the view of the index for normal searches).

Your hard autocommit at 8 minutes with openSearcher=false will trigger
a realtime searcher to open every 8 minutes along with the hard
commit.
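
Put together, the kind of configuration under discussion would look roughly like this in solrconfig.xml — a sketch only, assuming the standard updateHandler element; the maxTime values here are illustrative (an 8-minute hard commit, as mentioned above, and a 3-minute soft commit):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- hard commit: flushes to disk every 8 minutes; openSearcher=false
       means only a lightweight realtime searcher is opened -->
  <autoCommit>
    <maxTime>480000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- soft commit: opens a new caching searcher; -1 disables it -->
  <autoSoftCommit>
    <maxTime>180000</maxTime>
  </autoSoftCommit>
</updateHandler>
```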

-Yonik


On Tue, Jul 7, 2015 at 5:29 PM, Summer Shire shiresum...@gmail.com wrote:
 HI All,

 Can someone help me understand the following behavior.
 I have the following maxTimes on hard and soft commits

 yet I see a lot of Opening Searchers in the log
 org.apache.solr.search.SolrIndexSearcher- Opening Searcher@1656a258[main] 
 realtime
 also I see a soft commit happening almost every 30 secs
 org.apache.solr.update.UpdateHandler - start 
 commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
 <autoCommit>
   <maxTime>480000</maxTime>
   <openSearcher>false</openSearcher>
 </autoCommit>

 <autoSoftCommit>
   <maxTime>180000</maxTime>
 </autoSoftCommit>
 I tried disabling softCommit by setting maxTime to -1.
 On startup solrCore recognized it and logged Soft AutoCommit: disabled
 but I could still see softCommit=true
 org.apache.solr.update.UpdateHandler - start 
 commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
 <autoSoftCommit>
   <maxTime>-1</maxTime>
 </autoSoftCommit>

 Thanks,
 Summer


Re: Solr Boost Search word before Specific Content

2015-07-08 Thread Alessandro Benedetti
Thanks Ahmet for the proposed solution; it should work, but it is really
hard-coded and coupled with the specific keyword ("with" in the example).

I recently read an article from master Doug (
http://opensourceconnections.com/blog/2014/12/08/title-search-when-relevancy-is-only-skin-deep/
) .

I do believe this is the point you should start with.

In particular take extra care of the Pantheon approach, which can be
really useful to you.

 … We can use our pantheon along with a KeepWordsFilter
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-keep-words-tokenfilter.html
 to create yet another search field to use in our search. We can create a
“keep words” list that contains the terms in our pantheon. Only terms in
our list make it into the search index. We can call this field
pantheon_title. For example, when the following title is analyzed to go
into the index:

   - Who was Socrates

we will strip out all terms other than the ones in our pantheon:

   - Socrates

Similarly the title

   - Socrates and Plato on Metaphysics

Can be boiled down to these three members of our pantheon:

   - Socrates Plato Metaphysics
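
In Solr, the same filter is available as KeepWordFilterFactory. A hedged schema.xml sketch of such a pantheon field — the field, type, and file names here are made up for illustration:

```xml
<fieldType name="pantheon_text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- keepwords.txt holds the pantheon terms, one per line;
         every other token is dropped at index time -->
    <filter class="solr.KeepWordFilterFactory" words="keepwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>

<field name="pantheon_title" type="pantheon_text" indexed="true" stored="false"/>
<copyField source="title" dest="pantheon_title"/>
```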

Hope this can help !

Cheers

2015-07-08 8:09 GMT+01:00 Ahmet Arslan iori...@yahoo.com.invalid:

 Hi Jack,

 Here is hypothetical example:

 product_title_1 : dell laptop with laptop bag

 product_title_2 : laptop bag with cover
 product_title_3 : laptop bag and table

 You create an artificial/additional field,

 before_field_1 : dell laptop
 before_field_2 : laptop bag
 before_field_3 : laptop bag

 You can implement/embed any complex/custom logic (to indexing code) for
 obtaining values of this new boostable before_field.
 You can even implement it in a custom update processor.

 Then, at search time, use (e)Dismax's field boosting mechanism: q=Laptop
 bag&qf=product_title^0.3 before_field^0.7&defType=edismax

 Ahmet
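
The indexing-side extraction Ahmet describes could be plain string handling. A minimal sketch — the class and method names and the hard-coded split keyword "with" are illustrative assumptions, not Solr API:

```java
public class BeforeField {
    // Returns the part of the title before the first occurrence of the
    // keyword, or the whole title if the keyword is absent.
    static String beforeKeyword(String title, String keyword) {
        int i = title.toLowerCase().indexOf(" " + keyword.toLowerCase() + " ");
        return i >= 0 ? title.substring(0, i) : title;
    }

    public static void main(String[] args) {
        System.out.println(beforeKeyword("dell laptop with laptop bag", "with")); // dell laptop
        System.out.println(beforeKeyword("laptop bag with cover", "with"));       // laptop bag
    }
}
```

A real implementation would live wherever the indexing code builds documents (or in a custom update processor, as Ahmet suggests), and would likely handle a list of keywords rather than one.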



 On Wednesday, July 8, 2015 6:56 AM, JACK mfal...@gmail.com wrote:
 Hi Ahmet,

 Can you elaborate it more?
 Is it possible to solve my problem in Solr 5.0.0?
 if yes can just explain how?



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Solr-Boost-Search-word-before-Specific-Content-tp4216072p4216257.html

 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Remove operation of partial update doesn't work

2015-07-08 Thread Alessandro Benedetti
In these scenarios, documentation is key:

set: Set or replace the field value(s) with the specified value(s), or *remove
the values if 'null' or empty list is specified as the new value.*
May be specified as a single value, or as a list for multivalued fields.

add: Adds the specified values to a multivalued field.
May be specified as a single value, or as a list.

remove: Removes (all occurrences of) the specified values from a multivalued field.
May be specified as a single value, or as a list.

removeregex: Removes all occurrences of the specified regex from a multiValued field.
May be specified as a single value, or as a list.

inc: Increments a numeric value by a specific amount.
Must be specified as a single numeric value.

In my opinion set is the right direction to look into.
Not sure what happens if you use remove to remove a single-valued
field value.
Can you explain to us what you noticed? Does an empty value remain for that field?
It is kind of weird; I would expect the field to become null.
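
For reference, the JSON form of these modifiers sent to the /update handler looks like this — the document id and field names are purely illustrative:

```json
[
  {
    "id": "doc1",
    "tags": {"remove": "obsolete"},
    "views": {"inc": 1},
    "title": {"set": null}
  }
]
```

Note the distinction the documentation draws: set to null deletes the whole field, while remove deletes only the matching value(s) from a multivalued field.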

Cheers


2015-07-08 10:34 GMT+01:00 Mohsen Saboorian mohs...@gmail.com:

 In my code, when the operation is add, it works correctly on a multivalued
 field. But no multivalued field value can be deleted with the remove operation.
 The add operation adds a value to a multivalued field. The remove operation
 removes a value from a multivalued field.

 If you believe that something is not working, please state clearly why you
 believe that something is not working. Start by describing the symptom.


 -- Jack Krupansky

 On Mon, Jul 6, 2015 at 9:22 PM, Mohsen Saboorian mohs...@gmail.com
 wrote:

  I can partially 'add' fields to my Solr index, but 'remove' operation
 seems
  not working. I'm on Solr 4.10. Here is my SolrJ snippet:
 
   SolrInputDocument doc = new SolrInputDocument();
   Map<String, Object> partialUpdate = new HashMap<>();
   partialUpdate.put(operation, value); // value can be an object
   (string, number, etc.) or a list; operation can be "add", "set" or "remove".
   doc.addField("id", id); // document id
   doc.addField(fieldName, partialUpdate);

   getSolrServer().add(doc, commitWithin);
 
  Is there anything wrong with my code?
 




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Sorting documents by child documents

2015-07-08 Thread Alessandro Benedetti
I would like to get a deep understanding of your problem…
How do you want to sort a parent document by a normal field of the children?

Example :

Document 1
 Id: 5
 Children 1
 Id:51
 Title : A
 Children 2
 Id:52
 Title : Z

Document 2
 Id: 6
 Children 1
 Id:61
 Title : C
 Children 2
 Id:62
 Title : B

How can you sort the parent based on children fields?
You can sort a parent based on a value calculated out of the children's fields
(after you compute a unique value out of them: Max? Sum? Concat? etc.).

Can you explain your problem better?

Cheers


2015-07-08 7:17 GMT+01:00 DorZion dorz...@gmail.com:

 Hey,

 I'm using Solr 4.10.2 and I have child documents in every parent document.

 Previously, I used FunctionQuery to sort the documents:


 http://lucene.472066.n3.nabble.com/Sorting-documents-by-nested-child-docs-with-FunctionQueries-tp4209940.html
 

 Now, I want to sort the documents by their child documents with normal
 fields.

 It doesn't work when I use the sort parameter.

 Thanks in advance,

 Dor



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Sorting-documents-by-child-documents-tp4216263.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Too many Soft commits and opening searchers realtime

2015-07-08 Thread Alessandro Benedetti
So you are saying that no-one is triggering any commit, and that the auto
soft commit solution is not actually waiting for the configured time?
I suspect something is not as described, because if auto soft commit
were not working I would expect thousands of bugs raised.

Let's dig a little bit into the details…
What are you using, exactly, to index content?
Maybe some commit is actually hidden there :)

Cheers

2015-07-08 2:21 GMT+01:00 Summer Shire shiresum...@gmail.com:

 No the client lets solr handle it.


  On Jul 7, 2015, at 2:38 PM, Mike Drob mad...@cloudera.com wrote:
 
  Are the clients that are posting updates requesting commits?
 
  On Tue, Jul 7, 2015 at 4:29 PM, Summer Shire shiresum...@gmail.com
 wrote:
 
  HI All,
 
  Can someone help me understand the following behavior.
  I have the following maxTimes on hard and soft commits
 
  yet I see a lot of Opening Searchers in the log
  org.apache.solr.search.SolrIndexSearcher - Opening Searcher@1656a258
 [main]
  realtime
  also I see a soft commit happening almost every 30 secs
  org.apache.solr.update.UpdateHandler - start
 
 commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
  <autoCommit>
    <maxTime>480000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <autoSoftCommit>
    <maxTime>180000</maxTime>
  </autoSoftCommit>
  I tried disabling softCommit by setting maxTime to -1.
  On startup solrCore recognized it and logged Soft AutoCommit: disabled
  but I could still see softCommit=true
  org.apache.solr.update.UpdateHandler - start
 
 commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
  <autoSoftCommit>
    <maxTime>-1</maxTime>
  </autoSoftCommit>
 
  Thanks,
  Summer




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


solr 5 and schema.xml

2015-07-08 Thread spleenboy
Had a look at previous postings, but am still thoroughly confused.
I installed Solr 5 out of the box, built a core and uploaded some
documents using dynamic field types.
I can see my uploaded docs using the get method. When I query those docs,
results seem all over the place. 
The answer seems to be to alter my schema.xml file, but it doesn't appear to
be in conf directory where everyone seems to be directing me to.
I've then read that solr 5 doesn't by default use the schema.xml file, but
is using a managed schema by default. Apparently, I can't alter the
schema.xml file (which I can't find) but now need to use a REST api.
However, since I'm using dynamic fields, I'm not sure if this is still
necessary. 
I've hunted high and low for clear documentation on this, but am still
confused. I need to build a single index based upon customer data, searching
by email address.
Any help, or pointing in the right direction to where this is clearly
documented would be gratefully received.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-5-and-schema-xml-tp4216290.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Running Solr 5.2.1 on WIndows using NSSM

2015-07-08 Thread Adrian Liew
Answered my own question. :) It seems to work great for me by following this 
article.

http://www.norconex.com/how-to-run-solr5-as-a-service-on-windows/

Regards,
Adrian

-Original Message-
From: Adrian Liew [mailto:adrian.l...@avanade.com] 
Sent: Wednesday, July 8, 2015 4:43 PM
To: solr-user@lucene.apache.org
Subject: Running Solr 5.2.1 on WIndows using NSSM 

Hi guys,

I am looking to run Apache Solr v5.2.1 on a windows machine. I tried to setup a 
windows service using NSSM (Non-Sucking-Service-Manager) to install the windows 
service on the machine pointing to the solr.cmd file path itself and installing 
the service.

After installation, I tried to start the windows service but it gives back an 
alert message. It says "Windows could not start the SolrService service on
Local Computer. The service did not return an error. This could be an internal
Windows error or an internal service error."

Most of the examples for older Apache Solr use the "java -jar start.jar"
command to run Solr and seem to run okay with NSSM. I am not sure if this could
be a solr.cmd issue or NSSM's issue.

Alternatively, I have tried to use Windows Task Scheduler to configure a task 
to point to the solr.cmd as well and run task whenever the computer starts 
(regardless a user is logged in or not). The task scheduler seems to report 
back 'Task Start Failed' with Level of 'Error'.

Additionally, after checking Event Viewer, it returns the error from nssm:
"Failed to open process handle for process with PID 3640 when terminating
service Solr Service: The parameter is incorrect."

Chances are this points back to the solr.cmd file itself.

Thoughts?

Regards,
Adrian



Re: solr 5 and schema.xml

2015-07-08 Thread Alexandre Rafalovitch
You have the choice. You can use dynamic schema and control it using
API or use classic schema and control it explicitly via schema.xml.
You control that when you create the schema by using different
templates. It's just the default one is a dynamic schema.

Also, dynamic fields is not the same as dynamic schema, but I think
you knew that. You can use dynamic fields with either one of them.

So, try something like this:
bin/solr create_core -c classic_core -d basic_configs

Regards,
Alex.
P.s. You still get some APIs even with classic schema. But that's more
for overriding solrconfig.xml settings.


Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 8 July 2015 at 05:11, spleenboy paul.br...@neilltech.com wrote:
 Had a look at previous postings, but am still thoroughly confused.
 I installed Solr 5 out of the box, built a core and uploaded some
 documents using dynamic field types.
 I can see my uploaded docs using the get method. When I query those docs,
 results seem all over the place.
 The answer seems to be to alter my schema.xml file, but it doesn't appear to
 be in conf directory where everyone seems to be directing me to.
 I've then read that solr 5 doesn't by default use the schema.xml file, but
 is using a managed schema by default. Apparently, I can't alter the
 schema.xml file (which I can't find) but now need to use a REST api.
 However, since I'm using dynamic fields, I'm not sure if this is still
 necessary.
 I've hunted high and low for clear documentation on this, but am still
 confused. I need to build a single index based upon customer data, searching
 by email address.
 Any help, or pointing in the right direction to where this is clearly
 documented would be gratefully received.




 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/solr-5-and-schema-xml-tp4216290.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Indexed field to schema field

2015-07-08 Thread Gajendra Dadheech
I wish to do it in code, so the schema browser is less of an option.

Use case is: I wish to boost particular fields while matching; for that I
need to know my field to Solr field mapping, so that I can put that in the
query.
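
The suffix-stripping idea mentioned later in this thread (mapping an indexed name like price_float back to price) can be sketched in plain Java. The suffix list below is an illustrative assumption; adapt it to your own dynamic-field declarations:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DynamicFieldMapper {
    // Illustrative dynamic-field suffixes; adjust to match your schema's
    // <dynamicField name="*_string" .../> style declarations.
    private static final Pattern SUFFIX =
        Pattern.compile("^(.+)_(string|float|int|long|double|boolean|date)$");

    // Strips a known dynamic-field suffix, returning the logical field name.
    static String logicalName(String indexedField) {
        Matcher m = SUFFIX.matcher(indexedField);
        return m.matches() ? m.group(1) : indexedField;
    }

    public static void main(String[] args) {
        System.out.println(logicalName("price_float")); // price
        System.out.println(logicalName("title"));       // title
    }
}
```

Note that inside Solr the IndexSchema object can resolve a concrete field name against the dynamic-field patterns directly, which may avoid the regex guessing entirely.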


Thanks and regards,
Gajendra Dadheech


On Tue, Jul 7, 2015 at 9:23 PM, Erick Erickson erickerick...@gmail.com
wrote:

 Feels like an XY problem. Why do you want to do this? What's
 the use-case? Perhaps there's an alternative approach that
 satisfies the need.

 Best,
 Erick

 On Tue, Jul 7, 2015 at 4:21 AM, Mikhail Khludnev
 mkhlud...@griddynamics.com wrote:
  Just an idea, Solr Admin/Schema Browser reports some info like this,
 hence,
  you can trace the way in which it does it.
 
  On Tue, Jul 7, 2015 at 10:34 AM, Gajendra Dadheech gajju3...@gmail.com
  wrote:
 
  Hi,
 
  Can i some how translate fields which i read from
  newSearcher.getAtomicReader().fields(), to schema fields ? Does solr
 expose
  any method to do this translation ? Alternative approach i am thinking
 will
  involved lots of regex computation as the fields would be _string,
 _float
  etc and i would have to remove those suffixes, this becomes little
 tricky
  when fields are dynamic.
 
  Thanks and regards,
  Gajendra Dadheech
 
 
 
 
  --
  Sincerely yours
  Mikhail Khludnev
  Principal Engineer,
  Grid Dynamics
 
  http://www.griddynamics.com
  mkhlud...@griddynamics.com



Grouping and recip function not working with Sharding

2015-07-08 Thread Pankaj Sonawane
Hi,

I am using sharding (3 shards) with Zookeeper.

When I query a collection using the
*group=true&group.field=NAME&group.ngroups=true* parameters, *ngroups* in the
response is incorrect. However, I am getting the correct count in the doclist array.

Ex: The response below contains 5 groups (which is correct) but ngroups is 11.

{
   "responseHeader":{
      "status":0,
      "QTime":49,
      "params":{
         "group.ngroups":"true",
         "indent":"true",
         "start":"0",
         "q":"*:*",
         "group.field":"NAME",
         "group":"true",
         "wt":"json",
         "rows":"5"
      }
   },
   "grouped":{
      "NAME":{
         "matches":18,
         "ngroups":11,
         "groups":[
            {
               "groupValue":"A-SERIES",
               "doclist":{
                  "numFound":5,
                  "start":0,
                  "maxScore":1.0,
                  "docs":[
                     {
                        "NAME":"A-SERIES",
                        "_version_":1505559209034383400
                     }
                  ]
               }
            },
            {
               "groupValue":"B-SERIES",
               "doclist":{
                  "numFound":5,
                  "start":0,
                  "docs":[
                     {
                        "NAME":"B-SERIES",
                        "_version_":1505559209034383400
                     }
                  ]
               }
            },
            {
               "groupValue":"C-SERIES",
               "doclist":{
                  "numFound":1,
                  "start":0,
                  "docs":[
                     {
                        "NAME":"C-SERIES",
                        "_version_":1505559209034383400
                     }
                  ]
               }
            },
            {
               "groupValue":"D-SERIES",
               "doclist":{
                  "numFound":5,
                  "start":0,
                  "docs":[
                     {
                        "NAME":"D-SERIES",
                        "_version_":1505559209034383400
                     }
                  ]
               }
            },
            {
               "groupValue":"E-SERIES",
               "doclist":{
                  "numFound":3,
                  "start":0,
                  "maxScore":1.0,
                  "docs":[
                     {
                        "NAME":"E-SERIES",
                        "_version_":1505559209034383400
                     }
                  ]
               }
            }
         ]
      }
   }
}

I am facing the same problem with the recip function to get the latest record on a
date field when using sharding: it returns records in the wrong order.

Note: The same configuration works fine on a single machine without sharding.

Please help me find a solution.

Thanks.


Re: Solr 5.2.1 - SolrCloud create collection, core is only loaded after restart

2015-07-08 Thread Jens Brandt
Hi,

there was a problem with zookeeper and IPv6 that could be solved by using 
-Djava.net.preferIPv4Stack=true. Now, the core is correctly created, but I am 
wondering why I cannot see the core on the web interface, neither on the core 
admin screen nor in the Core Selector field. Only after restarting solr the 
core shows up on the web interface.

Best Regards,
  Jens


 Am 07.07.2015 um 12:49 schrieb Jens Brandt bra...@docoloc.de:
 
 Hi Erick,
 
 thanks for your reply.
 
 after creating the new collection via CollectionAPI I can see in the solr log 
 files that the core was created:
 
 Solr index directory '/var/lib/solr/gagel_shard1_replica1/data/index' doesn't 
 exist. Creating new index...
 
 However, when calling
 
 curl "http://solrtest:8080/solr/gagel/query?q=*:*"
 
 I get an HTTP 404 error:
 
 <html>
 <head>
 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
 <title>Error 404 Not Found</title>
 </head>
 <body><h2>HTTP ERROR 404</h2>
 <p>Problem accessing /solr/tubs/query. Reason:
 <pre>Not Found</pre></p><hr/><i><small>Powered by Jetty://</small></i><hr/>
 </body>
 </html>
 
 
 
 
 Am 06.07.2015 um 19:51 schrieb Erick Erickson erickerick...@gmail.com:
 
 bq: However, the named core is created but not loaded in solr.
 
 I'm not quite sure what that means, what is the symptom you see?
 Because this should be fine. I've sometimes been fooled by looking
 at the core admin UI screen and not reloading it. What happens
 if you try querying your new collection directly right after you create
 it? e.g.
 
 http://blah blah/solr/gagel/query?q=*:*
 
 You should get back a valid packet. Admittedly with 0 hits, but if the
 core were truly not loaded you'd get an error.
 
 And please, please, please do NOT use the core admin screen to try
 to add cores in SolrCloud mode. It's possible to use, but you must
 know _exactly_ what parameters to set or Bad Things Happen.
 Continue to use the collections API, it's safer.
 
 Best,
 Erick
 
 
 On Mon, Jul 6, 2015 at 8:54 AM, Jens Brandt bra...@docoloc.de wrote:
 Hi,
 
 I am trying to setup SolrCloud with external zookeeper. Solr 5.2.1 is 
 running on host solrtest at port 8080 and zookeeper already contains a 
 config with the name customerSolr. When I create a new collection using the 
 CollectionAPI by calling the following url:
 
 http://solrtest:8080/solr/admin/collections?action=CREATE&numShards=1&collection.configName=customerSolr&name=gagel
 
 I get a positive response and the core name gagel_shard1_replica1 is 
 returned. However, the named core is created but not loaded in solr. When I 
 try to manually add the core by using the Core Admin web interface I get
 the error that the core already exists. After a restart of solr the core is 
 loaded correctly.
 
 Can anyone please advise if I am doing something wrong or maybe this is an 
 issue in solr 5.2.1?
 
 Best Regards,
  Jens
 





Re: Tlog replay

2015-07-08 Thread Alessandro Benedetti
Hi Summer,

If you take a look at the CommitUpdateCommand class, you will notice no
flag is in there.

// this is the toString for example

@Override
public String toString() {
  return super.toString() + ",optimize=" + optimize
      + ",openSearcher=" + openSearcher
      + ",waitSearcher=" + waitSearcher
      + ",expungeDeletes=" + expungeDeletes
      + ",softCommit=" + softCommit
      + ",prepareCommit=" + prepareCommit
      + '}';
}


If you then access the UpdateCommand object, you find the flag :


public static int BUFFERING = 0x0001;         // update command is being buffered.
public static int REPLAY = 0x0002;            // update command is from replaying a log.
public static int PEER_SYNC = 0x0004;         // update command is a missing update being provided by a peer.
public static int IGNORE_AUTOCOMMIT = 0x0008; // this update should not count toward triggering of autocommits.
public static int CLEAR_CACHES = 0x0010;      // clear caches associated with the update log;
                                              // used when applying reordered DBQ updates when doing an add.

So flags=2 is actually saying that the update command is from
replaying a log (which is what you would expect).
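
A tiny standalone sketch of decoding that bitmask — the constants mirror the UpdateCommand values above, while the decode helper itself is illustrative:

```java
public class UpdateFlagsDecoder {
    // Flag constants as defined in Solr's UpdateCommand.
    static final int BUFFERING = 0x0001;
    static final int REPLAY = 0x0002;
    static final int PEER_SYNC = 0x0004;
    static final int IGNORE_AUTOCOMMIT = 0x0008;
    static final int CLEAR_CACHES = 0x0010;

    // Renders the set bits as a readable space-separated list.
    static String decode(int flags) {
        StringBuilder sb = new StringBuilder();
        if ((flags & BUFFERING) != 0) sb.append("BUFFERING ");
        if ((flags & REPLAY) != 0) sb.append("REPLAY ");
        if ((flags & PEER_SYNC) != 0) sb.append("PEER_SYNC ");
        if ((flags & IGNORE_AUTOCOMMIT) != 0) sb.append("IGNORE_AUTOCOMMIT ");
        if ((flags & CLEAR_CACHES) != 0) sb.append("CLEAR_CACHES ");
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(decode(2)); // REPLAY
    }
}
```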


Cheers


2015-07-08 3:01 GMT+01:00 Summer Shire shiresum...@gmail.com:


 Hi,

 When I restart my solr core the log replay starts and just before it
 finishes I see the following commit

 start
 commit{flags=2,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

 what does the “flags=2” param do ?

 when I try to send that param to the updateHandler manually solr does not
 like it

 curl http://localhost:6600/solr/main/update -H "Content-Type: text/xml"
 --data-binary '<commit openSearcher="true" flags="2" waitSearcher="false"/>'

 <?xml version="1.0" encoding="UTF-8"?>
 <response>
 <lst name="responseHeader"><int name="status">400</int><int
 name="QTime">0</int></lst><lst name="error"><str name="msg">Unknown commit
 parameter 'flags'</str><int name="code">400</int></lst>
 </response>

 thanks,
 Summer




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Indexed field to schema field

2015-07-08 Thread Alessandro Benedetti
I am really sorry Gajendra, but what do your latest mails mean?
Why is classic field boosting not an option for you?
Are you developing a custom query parser?
What are the parameters expected for this query parser? What is the
expected behaviour?
It is really hard to help with such fragmented information.

Cheers
2015-07-08 11:42 GMT+01:00 Gajendra Dadheech gajju3...@gmail.com:

 At the time of forming this request i am not sure which kind of field that
 would be. So i read fields in new searcher.


 Thanks and regards,
 Gajendra Dadheech


 On Wed, Jul 8, 2015 at 2:12 PM, Gajendra Dadheech gajju3...@gmail.com
 wrote:

  I wish to do it in code so schema browser is lesser of an option.
 
  Use case is  : I wish to boost particular fields while matching, for that
  i need to know My field to Solr field mapping. SO that i can put that in
  the query.
 
 
  Thanks and regards,
  Gajendra Dadheech
 
 
  On Tue, Jul 7, 2015 at 9:23 PM, Erick Erickson erickerick...@gmail.com
  wrote:
 
  Feels like an XY problem. Why do you want to do this? What's
  the use-case? Perhaps there's an alternative approach that
  satisfies the need.
 
  Best,
  Erick
 
  On Tue, Jul 7, 2015 at 4:21 AM, Mikhail Khludnev
  mkhlud...@griddynamics.com wrote:
   Just an idea, Solr Admin/Schema Browser reports some info like this,
  hence,
   you can trace the way in which it does it.
  
   On Tue, Jul 7, 2015 at 10:34 AM, Gajendra Dadheech 
 gajju3...@gmail.com
  
   wrote:
  
   Hi,
  
   Can i some how translate fields which i read from
   newSearcher.getAtomicReader().fields(), to schema fields ? Does solr
  expose
   any method to do this translation ? Alternative approach i am
 thinking
  will
   involved lots of regex computation as the fields would be _string,
  _float
   etc and i would have to remove those suffixes, this becomes little
  tricky
   when fields are dynamic.
  
   Thanks and regards,
   Gajendra Dadheech
  
  
  
  
   --
   Sincerely yours
   Mikhail Khludnev
   Principal Engineer,
   Grid Dynamics
  
   http://www.griddynamics.com
   mkhlud...@griddynamics.com
 
 
 




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Remove operation of partial update doesn't work

2015-07-08 Thread Mohsen Saboorian
I use add and remove both on a multivalued field (think of tags on a blog
post). For this, set to null won't work, because I want only one value (tag) to
be removed, and set to null neither removes one nor all of the values (all tags here).

So I use some SolrJ code which would translate to something like this:

{
  "id": docId,
  "tagId": {"remove": someTagId}
}

After commit, there is still tagId: someTagId in my document.

Here is my schema part for tagId:
<field name="tagId" type="int" indexed="true" stored="true" multiValued="true" />

Thanks,
Mohsen

On Wed, Jul 8, 2015 at 3:26 PM, Alessandro Benedetti 
benedetti.ale...@gmail.com wrote:

 In these scenarios, documentation is key:

 set: Set or replace the field value(s) with the specified value(s), or *remove
 the values if 'null' or empty list is specified as the new value.*
 May be specified as a single value, or as a list for multivalued fields.

 add: Adds the specified values to a multivalued field.
 May be specified as a single value, or as a list.

 remove: Removes (all occurrences of) the specified values from a multivalued field.
 May be specified as a single value, or as a list.

 removeregex: Removes all occurrences of the specified regex from a multiValued field.
 May be specified as a single value, or as a list.

 inc: Increments a numeric value by a specific amount.
 Must be specified as a single numeric value.

 In my opinion set is the right direction to look into.
 Not sure what happens if you use remove to remove a single-valued
 field value.
 Can you explain to us what you noticed? Does an empty value remain for that field?
 It is kind of weird; I would expect the field to become null.

 Cheers


 2015-07-08 10:34 GMT+01:00 Mohsen Saboorian mohs...@gmail.com:

  In my code, when the operation is add, it works correctly on a multivalued
  field. But no multivalued field value can be deleted with the remove operation.
  The add operation adds a value to a multivalued field. The remove operation
  removes a value from a multivalued field.
 
  If you believe that something is not working, please state clearly why
 you
  believe that something is not working. Start by describing the symptom.
 
 
  -- Jack Krupansky
 
  On Mon, Jul 6, 2015 at 9:22 PM, Mohsen Saboorian mohs...@gmail.com
  wrote:
 
   I can partially 'add' fields to my Solr index, but 'remove' operation
  seems
   not working. I'm on Solr 4.10. Here is my SolrJ snippet:
  
    SolrInputDocument doc = new SolrInputDocument();
    Map<String, Object> partialUpdate = new HashMap<>();
    // value can be an object (string, number, etc.) or a list;
    // operation can be "add", "set" or "remove".
    partialUpdate.put(operation, value);
    doc.addField("id", id); // document id
    doc.addField(fieldName, partialUpdate);
   
    getSolrServer().add(doc, commitWithin);
  
   Is there anything wrong with my code?
  
 



 --
 --

 Benedetti Alessandro
 Visiting card : http://about.me/alessandro_benedetti

 Tyger, tyger burning bright
 In the forests of the night,
 What immortal hand or eye
 Could frame thy fearful symmetry?

 William Blake - Songs of Experience -1794 England



Synonym with Proximity search in solr 5.1.0

2015-07-08 Thread dinesh naik
Hi,

We have a synonym file with below content:

cell phone ,nokia mobile

And we have 3 documents:

doc1:

  <doc>
    <field name="id">1001</field>
    <field name="name">Doc 1</field>
    <field name="text">I like nokia mobile</field>
  </doc>

doc2:

  <doc>
    <field name="id">1002</field>
    <field name="name">Doc 2</field>
    <field name="text">I cant leave without cell phone</field>
  </doc>

doc3:

  <doc>
    <field name="id">1003</field>
    <field name="name">Doc 3</field>
    <field name="text">I work with Nokia inc</field>
  </doc>

When I search for "cell phone", I should get doc1 and doc2 returned, but not
doc3.

The search syntax is: text:"cell phone"~500


How could i achieve this?



Best Regards,
Dinesh Naik


Re: Synonym with Proximity search in solr 5.1.0

2015-07-08 Thread Alessandro Benedetti
Showing your debug query would clarify the situation, but I assume you got
into a classic multi-word synonym problem[1] .
Hope the documents I pointed out are good for you.

Cheers

[1] http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
[2]
http://lucidworks.com/blog/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
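
One common workaround is to apply multi-word synonym expansion at index time only, so that "nokia mobile" documents also index the "cell phone" tokens. A sketch (the field type name, tokenizer, and surrounding filters are illustrative — adapt them to your actual schema):

```xml
<fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- expand multi-word synonyms while indexing -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- no synonym filter at query time, sidestepping the multi-word problem -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```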

2015-07-08 15:47 GMT+01:00 dinesh naik dineshkumarn...@gmail.com:

 Hi,

 We have a synonym file with below content:

 cell phone ,nokia mobile

 And we have 3 documents:

 doc1:

   <doc>
     <field name="id">1001</field>
     <field name="name">Doc 1</field>
     <field name="text">I like nokia mobile</field>
   </doc>

 doc2:

   <doc>
     <field name="id">1002</field>
     <field name="name">Doc 2</field>
     <field name="text">I cant leave without cell phone</field>
   </doc>

 doc3:

   <doc>
     <field name="id">1003</field>
     <field name="name">Doc 3</field>
     <field name="text">I work with Nokia inc</field>
   </doc>

 When I search for "cell phone", I should get doc1 and doc2 returned, but not
 doc3.

 The search syntax is: text:"cell phone"~500


 How could i achieve this?



 Best Regards,
 Dinesh Naik




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Difference in WordDelimiterFilter behavior between 4.7.2 and 4.9.1

2015-07-08 Thread Alessandro Benedetti
Taking a look at the documentation, I see these inconsistent orderings, in
my opinion:

*Example:*

Concatenate word parts and number parts, but not word and number parts that
occur in the same token.
<analyzer>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.WordDelimiterFilterFactory" catenateWords="1"
          catenateNumbers="1"/>
</analyzer>

*In:* hot-spot 100+42 XL40

*Tokenizer to Filter:* hot-spot(1), 100+42(2), XL40(3)

*Out:* hot(1), spot(2), hotspot(2) *(1?)*, 100(3), 42(4),
10042(4) *(2?)*, XL(5)*(3?)*, 40(6)*(4?)*

*Example:*

Concatenate all. Word and/or number parts are joined together.
<analyzer>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.WordDelimiterFilterFactory" catenateAll="1"/>
</analyzer>

*In:* XL-4000/ES

*Tokenizer to Filter:* XL-4000/ES(1)

*Out:* XL(1), 4000(2), ES(3), XL4000ES(3)*(1?)*


It is not clear to me why a token generated by a catenation should not occupy
the same position as the original one.


In your example, I am a little bit surprised by the first result as well:

RRR-COLECCION: COLECCIÓN: Gracita Morales foobar

Here are the final positions and terms that 4.7.2 yields for this on
query analysis:

1 rrr-coleccion
1 rrr
2 coleccion
2 rrrcoleccion *(1) ?*
3 coleccion
4 gracita
5 morales
6 foobar


It is not so clear whether the tokens must simply inherit their position from
the parent token, or arrange it based on the final list of tokens.

2015-07-08 16:03 GMT+01:00 Shawn Heisey apa...@elyograg.org:

 On 7/8/2015 8:44 AM, Shawn Heisey wrote:
  This is what 4.9.1 does with it:
 
  1 rrr-coleccion
  2 rrr
  2 coleccion
  2 rrrcoleccion
  3 coleccion
  4 gracita
  5 morales
  6 foobar

 Followup:  This is what Solr 5.2.1 does for query analysis, which also
 seems wrong, and doesn't match the phrase query:

 1 rrr-coleccion
 2 coleccion
 2 rrr
 2 rrrcoleccion
 3 coleccion
 4 gracita
 5 morales
 6 bleh

 The index analysis on 5.2.1 is the same as the other two versions.

 Thanks,
 Shawn




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Indexed field to schema field

2015-07-08 Thread Gajendra Dadheech
Sorry, I thought this was a common problem.

Will present it with more detail in some time if not able to solve it by then.


Thanks and regards,
Gajendra Dadheech
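
For what it's worth, the regex suffix-stripping approach mentioned in this thread can be sketched in plain Java. The suffix list here is hypothetical — it would have to mirror your actual dynamicField definitions:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DynamicFieldName {
    // Hypothetical suffixes mirroring dynamicField definitions like *_string.
    private static final Pattern SUFFIX =
        Pattern.compile("^(.+)_(string|float|int|long|double)$");

    // Strip a known dynamic-field suffix to recover the logical field name.
    static String logicalName(String indexedName) {
        Matcher m = SUFFIX.matcher(indexedName);
        return m.matches() ? m.group(1) : indexedName;
    }

    public static void main(String[] args) {
        System.out.println(logicalName("price_float")); // price
        System.out.println(logicalName("title"));       // title (no suffix)
    }
}
```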


On Wed, Jul 8, 2015 at 6:23 PM, Alessandro Benedetti 
benedetti.ale...@gmail.com wrote:

 I am really sorry Gajendra, but what do your latest mails mean?
 Why classic field boosting is not an option for you ?
 Are you developing a custom query parser ?
 What are the parameter expected for this query parser ? What is the
 behaviour expected ?
 It is really hard to help with such fragmented information.

 Cheers
 2015-07-08 11:42 GMT+01:00 Gajendra Dadheech gajju3...@gmail.com:

  At the time of forming this request i am not sure which kind of field
 that
  would be. So i read fields in new searcher.
 
 
  Thanks and regards,
  Gajendra Dadheech
 
 
  On Wed, Jul 8, 2015 at 2:12 PM, Gajendra Dadheech gajju3...@gmail.com
  wrote:
 
   I wish to do it in code so schema browser is lesser of an option.
  
   Use case is  : I wish to boost particular fields while matching, for
 that
   i need to know My field to Solr field mapping. SO that i can put that
 in
   the query.
  
  
   Thanks and regards,
   Gajendra Dadheech
  
  
   On Tue, Jul 7, 2015 at 9:23 PM, Erick Erickson 
 erickerick...@gmail.com
   wrote:
  
   Feels like an XY problem. Why do you want to do this? What's
   the use-case? Perhaps there's an alternative approach that
   satisfies the need.
  
   Best,
   Erick
  
   On Tue, Jul 7, 2015 at 4:21 AM, Mikhail Khludnev
   mkhlud...@griddynamics.com wrote:
Just an idea, Solr Admin/Schema Browser reports some info like this,
   hence,
you can trace the way in which it does it.
   
On Tue, Jul 7, 2015 at 10:34 AM, Gajendra Dadheech 
  gajju3...@gmail.com
   
wrote:
   
Hi,
   
 Can I somehow translate fields which I read from
 newSearcher.getAtomicReader().fields() to schema fields? Does Solr expose
 any method to do this translation? The alternative approach I am thinking of
 will involve lots of regex computation, as the fields would be _string,
 _float etc. and I would have to remove those suffixes; this becomes a little
 tricky when fields are dynamic.
   
Thanks and regards,
Gajendra Dadheech
   
   
   
   
--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics
   
http://www.griddynamics.com
mkhlud...@griddynamics.com
  
  
  
 



 --
 --

 Benedetti Alessandro
 Visiting card : http://about.me/alessandro_benedetti

 Tyger, tyger burning bright
 In the forests of the night,
 What immortal hand or eye
 Could frame thy fearful symmetry?

 William Blake - Songs of Experience -1794 England



Re: Difference in WordDelimiterFilter behavior between 4.7.2 and 4.9.1

2015-07-08 Thread Shawn Heisey
On 7/8/2015 8:44 AM, Shawn Heisey wrote:
 This is what 4.9.1 does with it:

 1 rrr-coleccion
 2 rrr
 2 coleccion
 2 rrrcoleccion
 3 coleccion
 4 gracita
 5 morales
 6 foobar

Followup:  This is what Solr 5.2.1 does for query analysis, which also
seems wrong, and doesn't match the phrase query:

1 rrr-coleccion
2 coleccion
2 rrr
2 rrrcoleccion
3 coleccion
4 gracita
5 morales
6 bleh

The index analysis on 5.2.1 is the same as the other two versions.

Thanks,
Shawn



Re: Search Handler Question

2015-07-08 Thread Paden
Awesome. This looks like a great resource. Thanks!




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Search-Handler-Question-tp4216341p4216348.html
Sent from the Solr - User mailing list archive at Nabble.com.


Search Handler Question

2015-07-08 Thread Paden
Hello, 

I've been trying to tune my search handler to get some better search results
and I just have like a general question about the search handler. 

This being the first time I've designed/implemented a search engine I've
been told that other engines operate on a kind of layered search. By
layered I mean you can

1. Prioritize exact phrasing first
2. Return documents that contain an AND meaning that they just contain
both words not necessarily in that order.  prioritize these as second.
3. Return documents that hit OR meaning that one of the words appears.
and so on...

I guess my question is: could you do this in Solr with a SINGLE query, not
multiple? I've tested some queries with the "+" modifier and it seems to
only return documents if they contain both words, with no ORs or anything.
Which I suppose it should. But could you implement a layered search
handler if you wanted to?





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Search-Handler-Question-tp4216341.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Search Handler Question

2015-07-08 Thread Alessandro Benedetti
You are actually describing the Edismax Query parser ( which does what you
quoted and even more) :

https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser

Take a look there, probably with a little tuning this is going to be a good
fit for you.
If any additional questions come up, just let us know,


Cheers
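
For example, a single edismax request can express all three "layers" at once — mm controls how many query terms must match (the OR/AND layers), while pf boosts documents containing the exact phrase. The field names and boost values below are illustrative:

```
q=cell phone
&defType=edismax
&qf=text
&pf=text^10
&mm=1
```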

2015-07-08 15:26 GMT+01:00 Paden rumsey...@gmail.com:

 Hello,

 I've been trying to tune my search handler to get some better search
 results
 and I just have like a general question about the search handler.

 This being the first time I've designed/implemented a search engine I've
 been told that other engines operate on a kind of layered search. By
 layered I mean you can

 1. Prioritize exact phrasing first
 2. Return documents that contain an AND meaning that they just contain
 both words not necessarily in that order.  prioritize these as second.
 3. Return documents that hit OR meaning that one of the words appears.
 and so on...

 I guess my question is: could you do this in Solr with a SINGLE query, not
 multiple? I've tested some queries with the "+" modifier and it seems to
 only return documents if they contain both words, with no ORs or
 anything.
 Which I suppose it should. But could you implement a layered search
 handler if you wanted to?





 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Search-Handler-Question-tp4216341.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Remove operation of partial update doesn't work

2015-07-08 Thread Won June Tai
I’d like to unsubscribe please.

 On Jul 8, 2015, at 11:01 AM, Alessandro Benedetti 
 benedetti.ale...@gmail.com wrote:
 
 I just tried on my own, and it is working perfectly.
 Stupid question, have you committed after your update?
 
 Cheers
 
 2015-07-08 15:41 GMT+01:00 Mohsen Saboorian mohs...@gmail.com:
 
 I use add and remove both on a multivalued field (think of tags on a blog
 post). For this, "set" to null won't work because I want only one value
 (tag) to be removed, and "set" to null removes neither one value nor all
 values (all tags here).

 So I use some SolrJ code which would translate to something like this:

 {
   "id": docId,
   "tagId": {"remove": someTagId}
 }

 After commit, there is still tagId: someTagId in my document.

 Here is my schema part for tagId:
 <field name="tagId" type="int" indexed="true" stored="true" multiValued="true"/>

 Thanks,
 Mohsen
 
 On Wed, Jul 8, 2015 at 3:26 PM, Alessandro Benedetti 
 benedetti.ale...@gmail.com wrote:
 
 In this scenarios, Documentation is key :
 
 Modifier
 
 Usage
 
 set
 
 Set or replace the field value(s) with the specified value(s), or *remove
 the values if 'null' or empty list is specified as the new value.*
 
 May be specified as a single value, or as a list for multivalued fields
 
 add
 
 Adds the specified values to a multivalued field.
 
 May be specified as a single value, or as a list.
 remove
 
 Removes (all occurrences of) the specified values from a multivalued
 field.
 
 May be specified as a single value, or as a list.
 removeregex
 
 Removes all occurrences of the specified regex from a multiValued field.
 
 May be specified as a single value, or as a list.
 
 inc
 
 Increments a numeric value by a specific amount.
 
 Must be specified as a single numeric value.
 
 In my opinion set is the right direction to look into.
 Not sure what happens if you use the remove to remove only a single
 valued
 field value.
 Can you explain us what you noticed ? An empty value remain for that
 field
 ?
 It is kind of weird, I would expect the field to become null.
 
 Cheers
 
 
 2015-07-08 10:34 GMT+01:00 Mohsen Saboorian mohs...@gmail.com:
 
 In my code when operation is add it works correctly on a multivalue
 field. But no multivalue field can be deleted with remove operation.
  The add operation adds a value to a multivalued field. The remove
 operation
 removes a value from a multivalued field.
 
 If you believe that something is not working, please state clearly why
 you
 believe that something is not working. Start by describing the symptom.
 
 
 -- Jack Krupansky
 
 On Mon, Jul 6, 2015 at 9:22 PM, Mohsen Saboorian mohs...@gmail.com
 wrote:
 
 I can partially 'add' fields to my Solr index, but 'remove' operation
 seems
 not working. I'm on Solr 4.10. Here is my SolrJ snippet:
 
    SolrInputDocument doc = new SolrInputDocument();
    Map<String, Object> partialUpdate = new HashMap<>();
    // value can be an object (string, number, etc.) or a list;
    // operation can be "add", "set" or "remove".
    partialUpdate.put(operation, value);
    doc.addField("id", id); // document id
    doc.addField(fieldName, partialUpdate);
 
    getSolrServer().add(doc, commitWithin);
 
 Is there anything wrong with my code?
 
 
 
 
 
 --
 --
 
 Benedetti Alessandro
 Visiting card : http://about.me/alessandro_benedetti
 
 Tyger, tyger burning bright
 In the forests of the night,
 What immortal hand or eye
 Could frame thy fearful symmetry?
 
 William Blake - Songs of Experience -1794 England
 
 
 
 
 
 -- 
 --
 
 Benedetti Alessandro
 Visiting card : http://about.me/alessandro_benedetti
 
 Tyger, tyger burning bright
 In the forests of the night,
 What immortal hand or eye
 Could frame thy fearful symmetry?
 
 William Blake - Songs of Experience -1794 England



Difference in WordDelimiterFilter behavior between 4.7.2 and 4.9.1

2015-07-08 Thread Shawn Heisey
I'm not sure if this is a bug, but it does break searches that work fine
in 4.7.2 if we put the same config and index on 4.9.1.

Here's a slightly redacted bit of text that's been sent to the index,
and is also used as a phrase query:

RRR-COLECCION: COLECCIÓN: Gracita Morales foobar

Here are the final positions and terms that 4.7.2 yields for this on
query analysis:

1 rrr-coleccion
1 rrr
2 coleccion
2 rrrcoleccion
3 coleccion
4 gracita
5 morales
6 foobar

This is what 4.9.1 does with it:

1 rrr-coleccion
2 rrr
2 coleccion
2 rrrcoleccion
3 coleccion
4 gracita
5 morales
6 foobar

In both versions, this is what the index analysis generates:

1 rrr
2 coleccion
3 coleccion
4 gracita
5 morales
6 bleh

Remember that it's a phrase query.  As you can see, only the query
analysis from 4.7.2 matches.  I'm not an expert, but the 4.9.1 WDF
position output seems wrong.
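
To illustrate why the 4.9.1 positions break the phrase match, here is a simplified position-alignment check in plain Java — a toy model, not Lucene's actual phrase scorer — using the position lists quoted above:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PhrasePositions {
    // Simplified check: a phrase matches if, for some start offset, every
    // query position holds at least one term also present at the aligned
    // index position. Real Lucene matching is richer, but this shows how
    // shifted positions can break a phrase query.
    static boolean matches(List<Set<String>> query, List<Set<String>> index) {
        outer:
        for (int start = 0; start + query.size() <= index.size(); start++) {
            for (int i = 0; i < query.size(); i++) {
                Set<String> overlap = new HashSet<>(query.get(i));
                overlap.retainAll(index.get(start + i));
                if (overlap.isEmpty()) continue outer;
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<Set<String>> index = List.of(
            Set.of("rrr"), Set.of("coleccion"), Set.of("coleccion"),
            Set.of("gracita"), Set.of("morales"), Set.of("foobar"));
        // 4.7.2: "rrr" and "rrr-coleccion" share position 1
        List<Set<String>> q472 = List.of(
            Set.of("rrr-coleccion", "rrr"), Set.of("coleccion", "rrrcoleccion"),
            Set.of("coleccion"), Set.of("gracita"), Set.of("morales"), Set.of("foobar"));
        // 4.9.1: "rrr" is shifted to position 2, leaving only "rrr-coleccion" at 1
        List<Set<String>> q491 = List.of(
            Set.of("rrr-coleccion"), Set.of("rrr", "coleccion", "rrrcoleccion"),
            Set.of("coleccion"), Set.of("gracita"), Set.of("morales"), Set.of("foobar"));
        System.out.println("4.7.2 matches: " + matches(q472, index));
        System.out.println("4.9.1 matches: " + matches(q491, index));
    }
}
```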

The difference in these positions happens at the WordDelimiterFilter
step.  I'm going to try my fieldType on the 5.2.1 example to see what
it does, and see if maybe the problem has already been fixed.
Unfortunately, due to a third-party component that has not been tested
with anything newer, I cannot upgrade beyond 4.9.1 at this time.

This is the fieldType present in both versions.  The 4.7 config has a
luceneMatchVersion of LUCENE_47, the 4.9.1 has LUCENE_4_9.

<fieldType name="genText" class="solr.TextField"
    sortMissingLast="true" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.ICUTokenizerFactory"
        rulefiles="Latn:Latin-break-only-on-whitespace.rbbi"/>
    <filter class="solr.PatternReplaceFilterFactory"
        pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$"
        replacement="$2"
    />
    <filter class="solr.WordDelimiterFilterFactory"
        splitOnCaseChange="1"
        splitOnNumerics="1"
        stemEnglishPossessive="1"
        generateWordParts="1"
        generateNumberParts="1"
        catenateWords="1"
        catenateNumbers="1"
        catenateAll="0"
        preserveOriginal="1"
    />
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.CJKBigramFilterFactory" outputUnigrams="true"/>
    <filter class="solr.LengthFilterFactory" min="1" max="512"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.ICUTokenizerFactory"
        rulefiles="Latn:Latin-break-only-on-whitespace.rbbi"/>
    <filter class="solr.PatternReplaceFilterFactory"
        pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$"
        replacement="$2"
    />
    <filter class="solr.WordDelimiterFilterFactory"
        splitOnCaseChange="1"
        splitOnNumerics="1"
        stemEnglishPossessive="1"
        generateWordParts="1"
        generateNumberParts="1"
        catenateWords="0"
        catenateNumbers="0"
        catenateAll="0"
        preserveOriginal="0"
    />
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.CJKBigramFilterFactory" outputUnigrams="false"/>
    <filter class="solr.LengthFilterFactory" min="1" max="512"/>
  </analyzer>
</fieldType>

Thanks,
Shawn



Re: Remove operation of partial update doesn't work

2015-07-08 Thread Alessandro Benedetti
I just tried on my own, and it is working perfectly.
Stupid question, have you committed after your update?

Cheers
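
For reference, the documented semantics — "remove (all occurrences of) the specified values" from a multivalued field — can be sketched in plain Java, independent of Solr:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemoveSemantics {
    // Sketch of the atomic-update "remove" modifier: delete all
    // occurrences of the listed values from a multivalued field.
    static List<Object> applyRemove(List<Object> current, List<Object> toRemove) {
        List<Object> result = new ArrayList<>(current);
        result.removeAll(toRemove); // removes every occurrence of each value
        return result;
    }

    public static void main(String[] args) {
        List<Object> tags = Arrays.asList(1, 2, 2, 3);
        System.out.println(applyRemove(tags, Arrays.asList(2))); // [1, 3]
    }
}
```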

2015-07-08 15:41 GMT+01:00 Mohsen Saboorian mohs...@gmail.com:

 I use add and remove both on a multivalued field (think of tags on a blog
 post). For this, "set" to null won't work because I want only one value
 (tag) to be removed, and "set" to null removes neither one value nor all
 values (all tags here).

 So I use some SolrJ code which would translate to something like this:

 {
   "id": docId,
   "tagId": {"remove": someTagId}
 }

 After commit, there is still tagId: someTagId in my document.

 Here is my schema part for tagId:
 <field name="tagId" type="int" indexed="true" stored="true" multiValued="true"/>

 Thanks,
 Mohsen

 On Wed, Jul 8, 2015 at 3:26 PM, Alessandro Benedetti 
 benedetti.ale...@gmail.com wrote:

  In this scenarios, Documentation is key :
 
  Modifier
 
  Usage
 
  set
 
  Set or replace the field value(s) with the specified value(s), or *remove
  the values if 'null' or empty list is specified as the new value.*
 
  May be specified as a single value, or as a list for multivalued fields
 
  add
 
  Adds the specified values to a multivalued field.
 
  May be specified as a single value, or as a list.
  remove
 
  Removes (all occurrences of) the specified values from a multivalued
 field.
 
  May be specified as a single value, or as a list.
  removeregex
 
  Removes all occurrences of the specified regex from a multiValued field.
 
  May be specified as a single value, or as a list.
 
  inc
 
  Increments a numeric value by a specific amount.
 
  Must be specified as a single numeric value.
 
  In my opinion set is the right direction to look into.
  Not sure what happens if you use the remove to remove only a single
 valued
  field value.
  Can you explain us what you noticed ? An empty value remain for that
 field
  ?
  It is kind of weird, I would expect the field to become null.
 
  Cheers
 
 
  2015-07-08 10:34 GMT+01:00 Mohsen Saboorian mohs...@gmail.com:
 
   In my code when operation is add it works correctly on a multivalue
   field. But no multivalue field can be deleted with remove operation.
    The add operation adds a value to a multivalued field. The remove
  operation
   removes a value from a multivalued field.
  
   If you believe that something is not working, please state clearly why
  you
   believe that something is not working. Start by describing the symptom.
  
  
   -- Jack Krupansky
  
   On Mon, Jul 6, 2015 at 9:22 PM, Mohsen Saboorian mohs...@gmail.com
   wrote:
  
I can partially 'add' fields to my Solr index, but 'remove' operation
   seems
not working. I'm on Solr 4.10. Here is my SolrJ snippet:
   
    SolrInputDocument doc = new SolrInputDocument();
    Map<String, Object> partialUpdate = new HashMap<>();
    // value can be an object (string, number, etc.) or a list;
    // operation can be "add", "set" or "remove".
    partialUpdate.put(operation, value);
    doc.addField("id", id); // document id
    doc.addField(fieldName, partialUpdate);
   
    getSolrServer().add(doc, commitWithin);
   
Is there anything wrong with my code?
   
  
 
 
 
  --
  --
 
  Benedetti Alessandro
  Visiting card : http://about.me/alessandro_benedetti
 
  Tyger, tyger burning bright
  In the forests of the night,
  What immortal hand or eye
  Could frame thy fearful symmetry?
 
  William Blake - Songs of Experience -1794 England
 




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


unexpected hl.fragsize behavior

2015-07-08 Thread Philip Durbin
I'm seeing strange hl.fragsize behavior in Solr 4.6.0, the version I
happen to be using.

I've been testing with this mp500.xml file...

http://svn.apache.org/viewvc/lucene/dev/tags/lucene_solr_4_6_0/solr/example/exampledocs/mp500.xml?view=markup

... using the query q=indication and I get some highlights:

```
$ curl -s "http://localhost:8983/solr/collection1/select?wt=json&indent=true&hl=true&hl.fl=*&q=indication" \
  | jq '.highlighting'
{
  MA147LL/A: {
features: [
      ", Battery level <em>indication</em>"
]
  }
}
```

Great! I got a highlight snippet back! But what if I start playing
with fragsize? According to
https://wiki.apache.org/solr/HighlightingParameters#hl.fragsize ,
fragsize=0 should mean that "the whole field value should be used with
no fragmenting". And it does:

```
$ curl -s "http://localhost:8983/solr/collection1/select?wt=json&indent=true&hl=true&hl.fl=*&q=indication&hl.fragsize=0" \
  | jq '.highlighting'
{
  MA147LL/A: {
features: [
      "Notes, Calendar, Phone book, Hold button, Date display, Photo
wallet, Built-in games, JPEG photo playback, Upgradeable firmware, USB
2.0 compatibility, Playback speed control, Rechargeable capability,
Battery level <em>indication</em>"
]
  }
}
```

As the docs indicate, fragsize=100 is the default and gives me the
same results as we saw above when we left out fragsize:

```
$ curl -s "http://localhost:8983/solr/collection1/select?wt=json&indent=true&hl=true&hl.fl=*&q=indication&hl.fragsize=100" \
  | jq '.highlighting'
{
  MA147LL/A: {
features: [
      ", Battery level <em>indication</em>"
]
  }
}
```

But wait a minute... fragsize is defined as "the size, in characters,
of the snippets (aka fragments) created by the highlighter". Is that
really 100 characters? More like 27 if I strip out the HTML tags:

```
$ echo -n ", Battery level <em>indication</em>" | awk '{gsub("<[^>]*>", "")}1'
, Battery level indication
$ echo -n ", Battery level <em>indication</em>" | awk \
    '{gsub("<[^>]*>", "")}1' | wc -c
      27
```

So that's weird. I ask for 100 characters but only get 27?

Let's try asking for 110 characters:

```
$ curl -s "http://localhost:8983/solr/collection1/select?wt=json&indent=true&hl=true&hl.fl=*&q=indication&hl.fragsize=110" \
  | jq '.highlighting'
{
  MA147LL/A: {
features: [
      ", Upgradeable firmware, USB 2.0 compatibility, Playback speed
control, Rechargeable capability, Battery level <em>indication</em>"
]
  }
}
```

That's better. With fragsize=110 we got back a snippet of 121
characters that time. But why did we only get back 27 characters from
fragsize=100?

Here's something else that's strange. With fragsize=120 I get back
*fewer* characters than fragsize=110. Only 108 characters back rather
than 121:

```
$ curl -s "http://localhost:8983/solr/collection1/select?wt=json&indent=true&hl=true&hl.fl=*&q=indication&hl.fragsize=120" \
  | jq '.highlighting'
{
  MA147LL/A: {
features: [
      " firmware, USB 2.0 compatibility, Playback speed control,
Rechargeable capability, Battery level <em>indication</em>"
]
  }
}
```

As I increase the fragsize shouldn't I get *more* characters back? And
again, why do I only get 27 characters back from fragsize=100?
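
One plausible intuition — a toy model in plain Java, not Solr's actual fragmenter — is that a fragment is only closed at a token boundary once the running length crosses fragsize, so actual fragment lengths merely approximate the requested size:

```java
import java.util.ArrayList;
import java.util.List;

public class ToyFragmenter {
    // Toy model: accumulate whole tokens; close a fragment once the running
    // character count reaches the target size. Real Solr logic differs, but
    // this shows why fragment lengths only approximate fragsize.
    static List<String> fragment(String text, int fragsize) {
        List<String> frags = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (String tok : text.split(" ")) {
            if (cur.length() > 0) cur.append(' ');
            cur.append(tok);
            if (cur.length() >= fragsize) {
                frags.add(cur.toString());
                cur.setLength(0);
            }
        }
        if (cur.length() > 0) frags.add(cur.toString()); // trailing remainder
        return frags;
    }

    public static void main(String[] args) {
        for (String f : fragment("one two three four five six seven", 10)) {
            System.out.println(f.length() + " " + f);
        }
    }
}
```

Note how a target of 10 yields fragments of 13, 13, and 5 characters — never exactly 10.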

I'm concerned about this because my fix for
https://github.com/IQSS/dataverse/issues/2191 is to make fragsize
configurable, but I'm getting such unexpected results playing with
different fragsize values I'm losing faith in it. We use highlighting
heavily to indicate where in the document a query matched. To be
clear, I haven't lost faith in Solr itself. It's a great project. I'm
just trying to understand what's going on above.

Any advice is welcome!

Phil

p.s. In case it's more readable, I also posted this (long) email as a
gist: https://gist.github.com/pdurbin/1a7b55e5714b7424fa94

-- 
Philip Durbin
Software Developer for http://dataverse.org
http://www.iq.harvard.edu/people/philip-durbin


Re: Solr 5.2.1 - SolrCloud create collection, core is only loaded after restart

2015-07-08 Thread Erick Erickson
My _guess_ is that you're getting a cached page somehow and never
getting to Solr
at all when you don't see the new core. What happens if you look
at the admin UI from another machine? Or perhaps a different browser?

If you tail the Solr log while you are looking, you should see the
request when you try to view the new core; if you don't see a request
come through, then it's a caching issue...

Or perhaps try issuing a core admin STATUS command?

http://solr:port/solr/admin/cores?action=STATUS

Best,
Erick

On Wed, Jul 8, 2015 at 6:14 AM, Jens Brandt bra...@docoloc.de wrote:
 Hi,

 there was a problem with zookeeper and IPv6 that could be solved by using 
 -Djava.net.preferIPv4Stack=true. Now, the core is correctly created, but I am 
 wondering why I cannot see the core on the web interface, neither on the core 
 admin screen nor in the Core Selector field. Only after restarting solr the 
 core shows up on the web interface.

 Best Regards,
   Jens


 Am 07.07.2015 um 12:49 schrieb Jens Brandt bra...@docoloc.de:

 Hi Erick,

 thanks for your reply.

 after creating the new collection via CollectionAPI I can see in the solr 
 log files that the core was created:

 Solr index directory '/var/lib/solr/gagel_shard1_replica1/data/index' 
 doesn't exist. Creating new index...

 However, when calling

 curl http://solrtest:8080/solr/gagel/query?q=*:*;

 I get an HTTP 404 error:

 <html>
 <head>
 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
 <title>Error 404 Not Found</title>
 </head>
 <body><h2>HTTP ERROR 404</h2>
 <p>Problem accessing /solr/tubs/query. Reason:
 <pre>    Not Found</pre></p><hr/><i><small>Powered by
 Jetty://</small></i><hr/>
 </body>
 </html>




 Am 06.07.2015 um 19:51 schrieb Erick Erickson erickerick...@gmail.com:

 bq: However, the named core is created but not loaded in solr.

 I'm not quite sure what that means, what is the symptom you see?
 Because this should be fine. I've sometimes been fooled by looking
 at the core admin UI screen and not reloading it. What happens
 if you try querying your new collection directly right after you create
 it? e.g.

 http://blah blah/solr/gagel/query?q=*:*

 You should get back a valid packet. Admittedly with 0 hits, but if the
 core were truly not loaded you'd get an error.

 And please, please, please do NOT use the core admin screen to try
 to add cores in SolrCloud mode. It's possible to use, but you must
 know _exactly_ what parameters to set or Bad Things Happen.
 Continue to use the collections API, it's safer.

 Best,
 Erick


 On Mon, Jul 6, 2015 at 8:54 AM, Jens Brandt bra...@docoloc.de wrote:
 Hi,

 I am trying to setup SolrCloud with external zookeeper. Solr 5.2.1 is 
 running on host solrtest at port 8080 and zookeeper already contains a 
 config with the name customerSolr. When I create a new collection using 
 the CollectionAPI by calling the following url:

 http://solrtest:8080/solr/admin/collections?action=CREATEnumShards=1collection.configName=customerSolrname=gagel;

 I get a positive response and the core name gagel_shard1_replica1 is 
 returned. However, the named core is created but not loaded in solr. When 
 I try to manually add the core by using the Core Admin web interface I get 
 the error that the core already exists. After a restart of solr the core 
 is loaded correctly.

 Can anyone please advise if I am doing something wrong or maybe this is an 
 issue in solr 5.2.1?

 Best Regards,
  Jens




Re: Remove operation of partial update doesn't work

2015-07-08 Thread Erick Erickson
Won June Tai:

Please follow the instructions here:
http://lucene.apache.org/solr/resources.html
search for unsubscribe.

You must use the _exact_ e-mail you used to subscribe. Also see the
problems link if it doesn't work the first time.

Best,
Erick

On Wed, Jul 8, 2015 at 8:03 AM, Won June Tai wonjune@gmail.com wrote:
 I’d like to unsubscribe please.

 On Jul 8, 2015, at 11:01 AM, Alessandro Benedetti 
 benedetti.ale...@gmail.com wrote:

 I just tried on my own, and it is working perfectly.
 Stupid question, have you committed after your update?

 Cheers

 2015-07-08 15:41 GMT+01:00 Mohsen Saboorian mohs...@gmail.com:


 I use add and remove both on a multivalued field (think of tags on a blog
 post). For this, "set" to null won't work because I want only one value
 (tag) to be removed, and "set" to null removes neither one value nor all
 values (all tags here).

 So I use some SolrJ code which would translate to something like this:

 {
   "id": docId,
   "tagId": {"remove": someTagId}
 }

 After commit, there is still tagId: someTagId in my document.

 Here is my schema part for tagId:
 <field name="tagId" type="int" indexed="true" stored="true" multiValued="true"/>

 Thanks,
 Mohsen

 On Wed, Jul 8, 2015 at 3:26 PM, Alessandro Benedetti 
 benedetti.ale...@gmail.com wrote:

 In this scenarios, Documentation is key :

 Modifier

 Usage

 set

 Set or replace the field value(s) with the specified value(s), or *remove
 the values if 'null' or empty list is specified as the new value.*

 May be specified as a single value, or as a list for multivalued fields

 add

 Adds the specified values to a multivalued field.

 May be specified as a single value, or as a list.
 remove

 Removes (all occurrences of) the specified values from a multivalued
 field.

 May be specified as a single value, or as a list.
 removeregex

 Removes all occurrences of the specified regex from a multiValued field.

 May be specified as a single value, or as a list.

 inc

 Increments a numeric value by a specific amount.

 Must be specified as a single numeric value.

 In my opinion set is the right direction to look into.
 Not sure what happens if you use the remove to remove only a single
 valued
 field value.
 Can you explain us what you noticed ? An empty value remain for that
 field
 ?
 It is kind of weird, I would expect the field to become null.

 Cheers


 2015-07-08 10:34 GMT+01:00 Mohsen Saboorian mohs...@gmail.com:

 In my code when operation is add it works correctly on a multivalue
 field. But no multivalue field can be deleted with remove operation.
 The add operation adds a value to a multivalued field. The remove
 operation
 removes a value from a multivalued field.

 If you believe that something is not working, please state clearly why
 you
 believe that something is not working. Start by describing the symptom.


 -- Jack Krupansky

 On Mon, Jul 6, 2015 at 9:22 PM, Mohsen Saboorian mohs...@gmail.com
 wrote:

 I can partially 'add' fields to my Solr index, but 'remove' operation
 seems
 not working. I'm on Solr 4.10. Here is my SolrJ snippet:

 SolrInputDocument doc = new SolrInputDocument();
 Map<String, Object> partialUpdate = new HashMap<>();
 // value can be a single object (string, number, etc.) or a list;
 // operation can be "add", "set" or "remove".
 partialUpdate.put(operation, value);
 doc.addField("id", id); // document id
 doc.addField(fieldName, partialUpdate);

 getSolrServer().add(doc, commitWithin);

 Is there anything wrong with my code?





 --
 --

 Benedetti Alessandro
 Visiting card : http://about.me/alessandro_benedetti

 Tyger, tyger burning bright
 In the forests of the night,
 What immortal hand or eye
 Could frame thy fearful symmetry?

 William Blake - Songs of Experience -1794 England





 --
 --

 Benedetti Alessandro
 Visiting card : http://about.me/alessandro_benedetti

 Tyger, tyger burning bright
 In the forests of the night,
 What immortal hand or eye
 Could frame thy fearful symmetry?

 William Blake - Songs of Experience -1794 England
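As an aside on the snippet discussed above: the JSON body of an atomic update is easy to build and inspect before sending it to Solr. A minimal sketch (the id, field name, and value are made up for illustration; they are not from the thread):

```python
import json

def atomic_update(doc_id, field, op, value):
    # op is one of Solr's atomic-update modifiers: "add", "set", "remove", ...
    # Returns the JSON array body you would POST to /update.
    return json.dumps([{"id": doc_id, field: {op: value}}])

payload = atomic_update("docId", "tagId", "remove", 42)
print(payload)  # [{"id": "docId", "tagId": {"remove": 42}}]
```

Printing the payload this way makes it easy to spot a malformed modifier map before blaming the server side.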



Re: Synonym with Proximity search in solr 5.1.0

2015-07-08 Thread dinesh naik
Hi Alessandro,
I have gone through the above suggested links, but i am not able to achieve
the above expected result.

 The issue here is: my searched text is a part of the field 'text'.

 <field name="text">I like nokia mobile </field>

 searched text: "nokia mobile"~500.

Best Regards,
Dinesh Naik

On Wed, Jul 8, 2015 at 8:36 PM, Alessandro Benedetti 
benedetti.ale...@gmail.com wrote:

 Showing your debug query would clarify the situation, but I assume you got
 into a classic multi-word synonym problem[1] .
 Hope the documents I pointed out are good for you.

 Cheers

 [1] http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
 [2]

 http://lucidworks.com/blog/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/

 2015-07-08 15:47 GMT+01:00 dinesh naik dineshkumarn...@gmail.com:

  Hi,
 
  We have a synonym file with below content:
 
  cell phone ,nokia mobile
 
  And we have 3 documents:
 
  doc1:
 
 <doc>
   <field name="id">1001</field>
   <field name="name">Doc 1</field>
   <field name="text">I like nokia mobile </field>
 </doc>
 
  doc2:
 
 <doc>
   <field name="id">1002</field>
   <field name="name">Doc 2</field>
   <field name="text">I cant leave without cell phone </field>
 </doc>
 
  doc3:
 
 <doc>
   <field name="id">1003</field>
   <field name="name">Doc 3</field>
   <field name="text">I work with Nokia inc</field>
 </doc>
 
  when I search for cell phone, I should get doc1 and doc2 returned but not
  doc3.

  The search syntax is: text:"cell phone"~500
 
 
  How could i achieve this?
 
 
 
  Best Regards,
  Dinesh Naik
 



 --
 --

 Benedetti Alessandro
 Visiting card : http://about.me/alessandro_benedetti

 Tyger, tyger burning bright
 In the forests of the night,
 What immortal hand or eye
 Could frame thy fearful symmetry?

 William Blake - Songs of Experience -1794 England




-- 
Best Regards,
Dinesh Naik
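The pitfall in the linked posts above can be sketched in a few lines: Solr's query parser splits on whitespace before query-time synonym expansion runs, so a multi-word entry like "cell phone" never fires as a unit. A phrase-aware expansion (roughly what the auto-phrasing filter approximates; the synonym map and field name below are illustrative):

```python
# Toy query-time expansion: treat the whole phrase as one unit, then emit a
# proximity query for each variant. Real Solr needs a filter that does this
# before whitespace splitting, which is the point of the linked articles.
SYNONYMS = {"cell phone": "nokia mobile", "nokia mobile": "cell phone"}

def expand_phrase(phrase, field="text", slop=500):
    variants = [phrase]
    if phrase in SYNONYMS:
        variants.append(SYNONYMS[phrase])
    return ['%s:"%s"~%d' % (field, v, slop) for v in variants]

print(expand_phrase("cell phone"))
```

With both variants ORed together, doc1 and doc2 would match and doc3 (which only mentions "Nokia inc") would not.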


Can I instruct the Tika Entity Processor to skip the first page using the DIH?

2015-07-08 Thread Paden
Hello, I'm using the DIH to import some files from one of my local
directories. However, every single one of these files has the same first
page. So I want to skip that first page in order to optimize search. 

Can this be accomplished by an instruction within the dataimporthandler or,
if not, how could you do this? 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Can-I-instruct-the-Tika-Entity-Processor-to-skip-the-first-page-using-the-DIH-tp4216373.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr 5 and schema.xml

2015-07-08 Thread Erick Erickson
bq: I've then read that solr 5 doesn't by default use the schema.xml file, but
is using a managed schema by default. Apparently, I can't alter the
schema.xml file (which I can't find) but now need to use a REST api.
However, since I'm using dynamic fields, I'm not sure if this is still
necessary.

Not at all, although it _is_ confusing. There's
1) classic, i.e. non-cloud Solr
2) SolrCloud
and the cross product of
1) standard (there should be a conf/schema.xml file to edit)
2) schemaless
3) managed schema

So you have six possible configurations, although whether you're
running in cloud mode or not the schemaless and managed schemas
are used identically.

So, which ones are you interested in? If you're running in SolrCloud
mode, you won't find any conf directory to edit, the files are stored in
Zookeeper and you must use the zkcli script with the upconfig command
to change them. Although for development there's a sweet little
IntelliJ plugin that lets you edit them directly...

Best,
Erick


On Wed, Jul 8, 2015 at 3:31 AM, Alexandre Rafalovitch
arafa...@gmail.com wrote:
 You have the choice. You can use dynamic schema and control it using
 API or use classic schema and control it explicitly via schema.xml.
 You control that when you create the schema by using different
 templates. It's just the default one is a dynamic schema.

 Also, dynamic fields are not the same as a dynamic schema, but I think
 you knew that. You can use dynamic fields with either one of them.

 So, try something like this:
 bin/solr create_core -c classic_core -d basic_configs

 Regards,
 Alex.
 P.s. You still get some APIs even with classic schema. But that's more
 for overriding solrconfig.xml settings.

 
 Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
 http://www.solr-start.com/


 On 8 July 2015 at 05:11, spleenboy paul.br...@neilltech.com wrote:
 Had a look at previous postings, but am still thoroughly confused.
 I installed Solr 5 out of the box, built a core and uploaded some
 documents using dynamic field types.
 I can see my uploaded docs using the get method. When I query those docs,
 results seem all over the place.
 The answer seems to be to alter my schema.xml file, but it doesn't appear to
 be in conf directory where everyone seems to be directing me to.
 I've then read that solr 5 doesn't by default use the schema.xml file, but
 is using a managed schema by default. Apparently, I can't alter the
 schema.xml file (which I can't find) but now need to use a REST api.
 However, since I'm using dynamic fields, I'm not sure if this is still
 necessary.
 I've hunted high and low for clear documentation on this, but am still
 confused. I need to build a single index based upon customer data, searching
 by email address.
 Any help, or pointing in the right direction to where this is clearly
 documented would be gratefully received.




 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/solr-5-and-schema-xml-tp4216290.html
 Sent from the Solr - User mailing list archive at Nabble.com.
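For the classic (basic_configs) route Alexandre describes above, the edit loop is just conf/schema.xml. A minimal field definition sketch for the email-lookup use case in the question (the field name is an assumption, not from the thread):

```xml
<!-- exact-match lookups: type "string" is untokenized, so a query must match
     the whole stored address -->
<field name="email" type="string" indexed="true" stored="true"/>
```

If partial or case-insensitive matching is wanted instead, a tokenized text type with a lowercase filter would be the direction to look.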


Re: Remove operation of partial update doesn't work

2015-07-08 Thread Mohsen Saboorian
Can you post your solrj code?
On 8 July 2015 at 19:32, Alessandro Benedetti 
benedetti.ale...@gmail.com wrote:

 I just tried on my own, and it is working perfectly.
 Stupid question, have you committed after your update?

 Cheers

 2015-07-08 15:41 GMT+01:00 Mohsen Saboorian mohs...@gmail.com:

  ​
  I use add and remove both on a multivalued field (think of tags on a blog
  post). For this, set null won't work because I want only one value (tag) to
  be removed, and set null neither removes one nor all of the values (all tags
  here).

  So I use some SolrJ code which would translate to something like this:
 
  {
  "id": "docId",
  "tagId": {"remove": someTagId}
  }
 
  After commit, there is still "tagId": someTagId in my document.
 
  ​Here is my schema part for tagId:
  <field name="tagId" type="int" indexed="true" stored="true" multiValued="true"/>
 
  Thanks,
  Mohsen
 
  On Wed, Jul 8, 2015 at 3:26 PM, Alessandro Benedetti 
  benedetti.ale...@gmail.com wrote:
 
   In this scenarios, Documentation is key :
  
   Modifier
  
   Usage
  
   set
  
   Set or replace the field value(s) with the specified value(s), or
 *remove
   the values if 'null' or empty list is specified as the new value.*
  
   May be specified as a single value, or as a list for multivalued fields
  
   add
  
   Adds the specified values to a multivalued field.
  
   May be specified as a single value, or as a list.
   remove
  
   Removes (all occurrences of) the specified values from a multivalued
  field.
  
   May be specified as a single value, or as a list.
   removeregex
  
   Removes all occurrences of the specified regex from a multiValued
 field.
  
   May be specified as a single value, or as a list.
  
   inc
  
   Increments a numeric value by a specific amount.
  
   Must be specified as a single numeric value.
  
   In my opinion set is the right direction to look into.
   Not sure what happens if you use the remove to remove only a single
  valued
   field value.
   Can you explain us what you noticed ? An empty value remain for that
  field
   ?
   It is kind of weird, I would expect the field to become null.
  
   Cheers
  
  
   2015-07-08 10:34 GMT+01:00 Mohsen Saboorian mohs...@gmail.com:
  
In my code when operation is add it works correctly on a multivalue
field. But no multivalue field can be deleted with remove
 operation.
 The add operation adds a value to a multivalued field. The remove
   operation
removes a value from a multivalued field.
   
If you believe that something is not working, please state clearly
 why
   you
believe that something is not working. Start by describing the
 symptom.
   
   
-- Jack Krupansky
   
On Mon, Jul 6, 2015 at 9:22 PM, Mohsen Saboorian mohs...@gmail.com
wrote:
   
 I can partially 'add' fields to my Solr index, but 'remove'
 operation
seems
 not working. I'm on Solr 4.10. Here is my SolrJ snippet:

 SolrInputDocument doc = new SolrInputDocument();
 Map<String, Object> partialUpdate = new HashMap<>();
 // value can be a single object (string, number, etc.) or a list;
 // operation can be "add", "set" or "remove".
 partialUpdate.put(operation, value);
 doc.addField("id", id); // document id
 doc.addField(fieldName, partialUpdate);

 getSolrServer().add(doc, commitWithin);

 Is there anything wrong with my code?

   
  
  
  
   --
   --
  
   Benedetti Alessandro
   Visiting card : http://about.me/alessandro_benedetti
  
   Tyger, tyger burning bright
   In the forests of the night,
   What immortal hand or eye
   Could frame thy fearful symmetry?
  
   William Blake - Songs of Experience -1794 England
  
 



 --
 --

 Benedetti Alessandro
 Visiting card : http://about.me/alessandro_benedetti

 Tyger, tyger burning bright
 In the forests of the night,
 What immortal hand or eye
 Could frame thy fearful symmetry?

 William Blake - Songs of Experience -1794 England



Re: Synonym with Proximity search in solr 5.1.0

2015-07-08 Thread Alessandro Benedetti
What do you mean? Have you used the implemented plugins already?
Can you show us the debugged query please?

Cheers

2015-07-08 16:48 GMT+01:00 dinesh naik dineshkumarn...@gmail.com:

 Hi Alessandro,
 I have gone through the above suggested links, but i am not able to achieve
 the above expected result.

  The issue here is: my searched text is a part of the field 'text'.

  <field name="text">I like nokia mobile </field>

  searched text: "nokia mobile"~500.

 Best Regards,
 Dinesh Naik

 On Wed, Jul 8, 2015 at 8:36 PM, Alessandro Benedetti 
 benedetti.ale...@gmail.com wrote:

  Showing your debug query would clarify the situation, but I assume you
 got
  into a classic multi-word synonym problem[1] .
  Hope the documents I pointed out are good for you.
 
  Cheers
 
  [1] http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
  [2]
 
 
 http://lucidworks.com/blog/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
 
  2015-07-08 15:47 GMT+01:00 dinesh naik dineshkumarn...@gmail.com:
 
   Hi,
  
   We have a synonym file with below content:
  
   cell phone ,nokia mobile
  
   And we have 3 documents:
  
   doc1:
  
  <doc>
    <field name="id">1001</field>
    <field name="name">Doc 1</field>
    <field name="text">I like nokia mobile </field>
  </doc>
  
   doc2:
  
  <doc>
    <field name="id">1002</field>
    <field name="name">Doc 2</field>
    <field name="text">I cant leave without cell phone </field>
  </doc>
  
   doc3:
  
  <doc>
    <field name="id">1003</field>
    <field name="name">Doc 3</field>
    <field name="text">I work with Nokia inc</field>
  </doc>
  
   when I search for cell phone, I should get doc1 and doc2 returned but
   not doc3.

   The search syntax is: text:"cell phone"~500
  
  
   How could i achieve this?
  
  
  
   Best Regards,
   Dinesh Naik
  
 
 
 
  --
  --
 
  Benedetti Alessandro
  Visiting card : http://about.me/alessandro_benedetti
 
  Tyger, tyger burning bright
  In the forests of the night,
  What immortal hand or eye
  Could frame thy fearful symmetry?
 
  William Blake - Songs of Experience -1794 England
 



 --
 Best Regards,
 Dinesh Naik




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Grouping and recip function not working with Sharding

2015-07-08 Thread Erick Erickson
From the reference guide:

group.ngroups and group.facet require that all documents in each group
must be co-located on the same shard in order for accurate counts to
be returned. Document routing via composite keys can be a useful
solution in many situations.

It's not clear what you think the problem here is. You say:
bq: Ex: Below response contains 5 groups (Which is correct) but
ngroups is 11. But you have rows set to 5, so?

As far as your sorting issue, again an example showing what you think
is wrong would be very helpful.

Best,
Erick



On Wed, Jul 8, 2015 at 6:38 AM, Pankaj Sonawane
pankaj4sonaw...@gmail.com wrote:
 Hi,

 I am using sharding (3 shards) with Zookeeper.

 When I query a collection using 
 *group=true&group.field=NAME&group.ngroups=true* parameters, *ngroups* in
 response is incorrect. However I am getting correct count in doclist array.

 Ex: Below response contains 5 groups (Which is correct) but ngroups is 11.

  {
     "responseHeader":{
        "status":0,
        "QTime":49,
        "params":{
           "group.ngroups":"true",
           "indent":"true",
           "start":"0",
           "q":"*:*",
           "group.field":"NAME",
           "group":"true",
           "wt":"json",
           "rows":"5"
        }
     },
     "grouped":{
        "NAME":{
           "matches":18,
           "ngroups":11,
           "groups":[
              {
                 "groupValue":"A-SERIES",
                 "doclist":{
                    "numFound":5,
                    "start":0,
                    "maxScore":1,
                    "docs":[
                       {
                          "NAME":"A-SERIES",
                          "_version_":1505559209034383400
                       }
                    ]
                 }
              },
              {
                 "groupValue":"B-SERIES",
                 "doclist":{
                    "numFound":5,
                    "start":0,
                    "docs":[
                       {
                          "NAME":"B-SERIES",
                          "_version_":1505559209034383400
                       }
                    ]
                 }
              },
              {
                 "groupValue":"C-SERIES",
                 "doclist":{
                    "numFound":1,
                    "start":0,
                    "docs":[
                       {
                          "NAME":"C-SERIES",
                          "_version_":1505559209034383400
                       }
                    ]
                 }
              },
              {
                 "groupValue":"D-SERIES",
                 "doclist":{
                    "numFound":5,
                    "start":0,
                    "docs":[
                       {
                          "NAME":"D-SERIES",
                          "_version_":1505559209034383400
                       }
                    ]
                 }
              },
              {
                 "groupValue":"E-SERIES",
                 "doclist":{
                    "numFound":3,
                    "start":0,
                    "maxScore":1,
                    "docs":[
                       {
                          "NAME":"E-SERIES",
                          "_version_":1505559209034383400
                       }
                    ]
                 }
              }
           ]
        }
     }
  }

 I am facing same problem with Recip function to get latest record on some
 date field when using sharding. It returns back records in wrong order.

 Note: Same configuration works fine on single machine without sharding.

 Please Help me to find solution.

 Thanks.
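Since ngroups is only accurate when every document of a group lives on one shard (per the reference-guide note quoted earlier in this thread), composite-ID routing is the usual fix: prefix each document id with the grouping key so all docs sharing a NAME hash to the same shard. A sketch of the id scheme (the exact prefix choice is an assumption for illustration):

```python
def composite_id(group_value, doc_id):
    # SolrCloud's compositeId router hashes the part before "!", so every
    # document with the same NAME prefix lands on the same shard, making
    # group.ngroups and group.facet counts accurate again.
    return "%s!%s" % (group_value, doc_id)

print(composite_id("A-SERIES", "1001"))  # A-SERIES!1001
```

The trade-off is potentially uneven shard sizes if some groups are much larger than others.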


Re: Remove operation of partial update doesn't work

2015-07-08 Thread Mohsen Saboorian
Yes I did. I use commitWithin to commit after a fixed timeout. Moreover my
add operation works!
On 8 July 2015 at 19:32, Alessandro Benedetti 
benedetti.ale...@gmail.com wrote:

 I just tried on my own, and it is working perfectly.
 Stupid question, have you committed after your update?

 Cheers

 2015-07-08 15:41 GMT+01:00 Mohsen Saboorian mohs...@gmail.com:

  ​
   I use add and remove both on a multivalued field (think of tags on a blog
   post). For this, set null won't work because I want only one value (tag) to
   be removed, and set null neither removes one nor all of the values (all tags
   here).

   So I use some SolrJ code which would translate to something like this:
 
   {
   "id": "docId",
   "tagId": {"remove": someTagId}
   }
 
   After commit, there is still "tagId": someTagId in my document.
 
  ​Here is my schema part for tagId:
   <field name="tagId" type="int" indexed="true" stored="true" multiValued="true"/>
 
  Thanks,
  Mohsen
 
  On Wed, Jul 8, 2015 at 3:26 PM, Alessandro Benedetti 
  benedetti.ale...@gmail.com wrote:
 
   In this scenarios, Documentation is key :
  
   Modifier
  
   Usage
  
   set
  
   Set or replace the field value(s) with the specified value(s), or
 *remove
   the values if 'null' or empty list is specified as the new value.*
  
   May be specified as a single value, or as a list for multivalued fields
  
   add
  
   Adds the specified values to a multivalued field.
  
   May be specified as a single value, or as a list.
   remove
  
   Removes (all occurrences of) the specified values from a multivalued
  field.
  
   May be specified as a single value, or as a list.
   removeregex
  
   Removes all occurrences of the specified regex from a multiValued
 field.
  
   May be specified as a single value, or as a list.
  
   inc
  
   Increments a numeric value by a specific amount.
  
   Must be specified as a single numeric value.
  
   In my opinion set is the right direction to look into.
   Not sure what happens if you use the remove to remove only a single
  valued
   field value.
   Can you explain us what you noticed ? An empty value remain for that
  field
   ?
   It is kind of weird, I would expect the field to become null.
  
   Cheers
  
  
   2015-07-08 10:34 GMT+01:00 Mohsen Saboorian mohs...@gmail.com:
  
In my code when operation is add it works correctly on a multivalue
field. But no multivalue field can be deleted with remove
 operation.
 The add operation adds a value to a multivalued field. The remove
   operation
removes a value from a multivalued field.
   
If you believe that something is not working, please state clearly
 why
   you
believe that something is not working. Start by describing the
 symptom.
   
   
-- Jack Krupansky
   
On Mon, Jul 6, 2015 at 9:22 PM, Mohsen Saboorian mohs...@gmail.com
wrote:
   
 I can partially 'add' fields to my Solr index, but 'remove'
 operation
seems
 not working. I'm on Solr 4.10. Here is my SolrJ snippet:

 SolrInputDocument doc = new SolrInputDocument();
 Map<String, Object> partialUpdate = new HashMap<>();
 // value can be a single object (string, number, etc.) or a list;
 // operation can be "add", "set" or "remove".
 partialUpdate.put(operation, value);
 doc.addField("id", id); // document id
 doc.addField(fieldName, partialUpdate);

 getSolrServer().add(doc, commitWithin);

 Is there anything wrong with my code?

   
  
  
  
   --
   --
  
   Benedetti Alessandro
   Visiting card : http://about.me/alessandro_benedetti
  
   Tyger, tyger burning bright
   In the forests of the night,
   What immortal hand or eye
   Could frame thy fearful symmetry?
  
   William Blake - Songs of Experience -1794 England
  
 



 --
 --

 Benedetti Alessandro
 Visiting card : http://about.me/alessandro_benedetti

 Tyger, tyger burning bright
 In the forests of the night,
 What immortal hand or eye
 Could frame thy fearful symmetry?

 William Blake - Songs of Experience -1794 England



Re: Tlog replay

2015-07-08 Thread Summer Shire
Thanks Alessandro !

Any idea on why I couldn't curl the solr core and pass the flag param ?


 On Jul 8, 2015, at 7:12 AM, Alessandro Benedetti benedetti.ale...@gmail.com 
 wrote:
 
 Hi Summer,
 
 If you take a look to the CommitUpdateCommand class, you will notice no
 Flag is in there.
 
 // this is the toString for example
 
 @Override
 public String toString() {
   return super.toString() + ",optimize=" + optimize
       + ",openSearcher=" + openSearcher
       + ",waitSearcher=" + waitSearcher
       + ",expungeDeletes=" + expungeDeletes
       + ",softCommit=" + softCommit
       + ",prepareCommit=" + prepareCommit
       + '}';
 }
 
 
 If you then access the UpdateCommand object, you find the flag :
 
 
 public static int BUFFERING = 0x0001;// update command is
 being buffered.
 public static int REPLAY= 0x0002;// update command is from
 replaying a log.
 public static int PEER_SYNC= 0x0004; // update command is a
 missing update being provided by a peer.
 public static int IGNORE_AUTOCOMMIT = 0x0008; // this update
 should not count toward triggering of autocommits.
 public static int CLEAR_CACHES = 0x0010; // clear caches
 associated with the update log.  used when applying reordered DBQ
 updates when doing an add.
 
 So the flag =2 is actually saying that the update command is from
 replaying a log ( which is what you would expect)
 
 
 Cheers
 
 
 2015-07-08 3:01 GMT+01:00 Summer Shire shiresum...@gmail.com:
 
 
 Hi,
 
 When I restart my solr core the log replay starts and just before it
 finishes I see the following commit
 
 start
 commit{flags=2,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
 
 what does the “flags=2” param do ?
 
 when I try to send that param to the updateHandler manually solr does not
 like it
 
  curl http://localhost:6600/solr/main/update -H "Content-Type: text/xml"
  --data-binary '<commit openSearcher="true" flags="2" waitSearcher="false"/>'
 
  <?xml version="1.0" encoding="UTF-8"?>
  <response>
  <lst name="responseHeader"><int name="status">400</int><int
  name="QTime">0</int></lst><lst name="error"><str name="msg">Unknown commit
  parameter 'flags'</str><int name="code">400</int></lst>
  </response>
 
 thanks,
 Summer
 
 
 
 
 -- 
 --
 
 Benedetti Alessandro
 Visiting card : http://about.me/alessandro_benedetti
 
 Tyger, tyger burning bright
 In the forests of the night,
 What immortal hand or eye
 Could frame thy fearful symmetry?
 
 William Blake - Songs of Experience -1794 England
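The flags=2 in the replayed commit decodes against the UpdateCommand constants quoted above; it is a plain bitmask, so the check is a single AND (a sketch mirroring those constants):

```python
# Mirror of the UpdateCommand flag constants from the thread
BUFFERING         = 0x0001  # update command is being buffered
REPLAY            = 0x0002  # update command is from replaying a log
PEER_SYNC         = 0x0004  # missing update provided by a peer
IGNORE_AUTOCOMMIT = 0x0008  # should not count toward autocommit triggering
CLEAR_CACHES      = 0x0010  # clear caches associated with the update log

flags = 2  # the value seen in the tlog-replay commit log line
print(bool(flags & REPLAY))  # True: this commit comes from log replay
```

This also explains the 400 from the curl attempt: flags is internal state set by the replay code path, not a user-facing commit parameter.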


Running Solr 5.2.1 on WIndows using NSSM

2015-07-08 Thread Adrian Liew
Hi guys,

I am looking to run Apache Solr v5.2.1 on a windows machine. I tried to setup a 
windows service using NSSM (Non-Sucking-Service-Manager) to install the windows 
service on the machine pointing to the solr.cmd file path itself and installing 
the service.

After installation, I tried to start the windows service but it gives back an 
alert message. It says "Windows could not start the SolrService service on 
Local Computer. The service did not return an error. This could be an internal 
Windows error or an internal service error."

Most of the examples for older Apache Solr use the "java -jar start.jar" 
command to run Solr and seem to run okay with NSSM. I am not sure if this could 
be a solr.cmd issue or NSSM's issue.

Alternatively, I have tried to use Windows Task Scheduler to configure a task 
to point to the solr.cmd as well and run task whenever the computer starts 
(regardless a user is logged in or not). The task scheduler seems to report 
back 'Task Start Failed' with Level of 'Error'.

Additionally, after checking Event Viewer, it returns the NSSM error "Failed 
to open process handle for process with PID 3640 when terminating 
service Solr Service: The parameter is incorrect."

Chances are this points back to the solr.cmd file itself.

Thoughts?

Regards,
Adrian



Re: Indexed field to schema field

2015-07-08 Thread Gajendra Dadheech
At the time of forming this request I am not sure which kind of field that
would be, so I read the fields from the new searcher.


Thanks and regards,
Gajendra Dadheech


On Wed, Jul 8, 2015 at 2:12 PM, Gajendra Dadheech gajju3...@gmail.com
wrote:

 I wish to do it in code so schema browser is lesser of an option.

 Use case is  : I wish to boost particular fields while matching, for that
 i need to know My field to Solr field mapping. SO that i can put that in
 the query.


 Thanks and regards,
 Gajendra Dadheech


 On Tue, Jul 7, 2015 at 9:23 PM, Erick Erickson erickerick...@gmail.com
 wrote:

 Feels like an XY problem. Why do you want to do this? What's
 the use-case? Perhaps there's an alternative approach that
 satisfies the need.

 Best,
 Erick

 On Tue, Jul 7, 2015 at 4:21 AM, Mikhail Khludnev
 mkhlud...@griddynamics.com wrote:
  Just an idea, Solr Admin/Schema Browser reports some info like this,
 hence,
  you can trace the way in which it does it.
 
  On Tue, Jul 7, 2015 at 10:34 AM, Gajendra Dadheech gajju3...@gmail.com
 
  wrote:
 
  Hi,
 
  Can i some how translate fields which i read from
  newSearcher.getAtomicReader().fields(), to schema fields ? Does solr
 expose
  any method to do this translation ? Alternative approach i am thinking
 will
  involved lots of regex computation as the fields would be _string,
 _float
  etc and i would have to remove those suffixes, this becomes little
 tricky
  when fields are dynamic.
 
  Thanks and regards,
  Gajendra Dadheech
 
 
 
 
  --
  Sincerely yours
  Mikhail Khludnev
  Principal Engineer,
  Grid Dynamics
 
  http://www.griddynamics.com
  mkhlud...@griddynamics.com





Re: Adding field to query result

2015-07-08 Thread Erik Hatcher
Maya - where’s the variable come from?   

You can compute a “pseudo-field”, something like this:

$ bin/solr create -c test
$ bin/post -c test -type text/csv -out yes -d $'id,type,price_td\n1,Toys,55.00'
$ open 
http://localhost:8983/solr/test/select?q=*:*&wt=xml&fl=id,type,price_td,sale_price:product(price_td,0.9)

Note that I used price_td, otherwise the field type guessing will make it 
multivalued and not suitable for functions like that.

Does that help?   Or maybe you’re interested in something like Solr’s 
ExternalFileField?

—
Erik Hatcher, Senior Solutions Architect
http://www.lucidworks.com




 On Jul 8, 2015, at 2:03 PM, Maya G maiki...@gmail.com wrote:
 
 Hello,
 
 I'm using solr 4.10.
 I'd like to know if it is possible to add a field only on query response and
 calculate its value for the specific query.
 
 For example:
 Assume this is the document.
 <doc>
 <Id>1</Id>
 <Type>Toys</Type>
 <Price>55</Price>
 </doc>
 
 I would like the response to contain another field whose value is
 calculated from the value of 'price' and a given variable.
 The field's value can change from query to query and shouldn't be
 indexed.
 
 Is it possible to run a query and create a new field 'price_sale' at
 runtime?
 
 Thanks in advance,
 Maya
 Is there a way to do this on solr? 
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Adding-field-to-query-result-tp4216396.html
 Sent from the Solr - User mailing list archive at Nabble.com.
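The sale_price:product(price_td,0.9) pseudo-field in Erik's URL is computed per returned document at query time and never touches the index; for the sample document it is just:

```python
# What the function query evaluates for the example doc (price_td = 55.00)
price_td = 55.00
sale_price = price_td * 0.9  # product(price_td, 0.9)
print(round(sale_price, 2))  # 49.5
```

Because the multiplier sits in the fl parameter, each request can pass a different factor without reindexing, which matches Maya's "value changes from query to query" requirement.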



Re: Solr Encoding Issue?

2015-07-08 Thread Erick Erickson
Attachments are pretty aggressively stripped by the e-mail server, so
there's nothing to see,
you'll have to paste it somewhere else and provide a link.

Usually, though, this is a character set issue with the browser using a
different charset than
Solr, it's really the same character, just displayed differently.

Shot in the dark though.

Erick

On Wed, Jul 8, 2015 at 10:49 AM, Tarala, Magesh mtar...@bh.com wrote:

  I’m ingesting a .TXT file with HTML content into Solr. The content has
 the following character highlighted below:

 The file we get from CRM (also attached):

 [image: cid:image001.png@01D0B972.75BE23F0]





 After ingesting into solr, I see a different character. This is query
 response from solr management console.



 [image: cid:image003.png@01D0B972.D1AED290]





 Anybody know how I can prevent this from happening?



 Thanks!



RE: Solr Encoding Issue?

2015-07-08 Thread Tarala, Magesh
Looks like images did not come through. Here's the text...


I'm ingesting a .TXT file with HTML content into Solr. The content has the 
following character highlighted below:
The file we get from CRM (also attached):
<td align=center style="text-align:center;"><font size=3><span 
style="font-size:12pt;"><b>Enter Data in TK Only</b><font 
face=Wingdings><b>à</b></font></span></font></td>



After ingesting into solr, I see a different character. This is query response 
from solr management console.
<td align=\"center\" style=\"text-align:center;\"><font size=\"3\"><span 
style=\"font-size:12pt;\"><b>Enter Data in TK Only</b><font 
face=\"Wingdings\"><b>à </b></font></span></font></td>



I'm expecting to see <b>à</b>
But I'm seeing <b>à </b>

Anybody know how I can prevent this from happening?

Thanks!
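This symptom looks like classic mojibake (an educated guess, not confirmed in the thread): the UTF-8 bytes of an accented character were decoded somewhere in the ingest chain as Latin-1, which turns one character into two. For à the round trip is exact and reproducible:

```python
original = "à"  # U+00E0; its UTF-8 encoding is the two bytes C3 A0
mangled = original.encode("utf-8").decode("latin-1")
print(repr(mangled))  # '\xc3\xa0' -> "Ã" followed by a no-break space
```

If that matches what you see, the fix is to declare the charset explicitly at every hop (file read, HTTP Content-Type, Tika/DIH configuration) rather than letting an ISO-8859-1 default apply.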


Adding field to query result

2015-07-08 Thread Maya G
Hello,

I'm using solr 4.10.
I'd like to know if it is possible to add a field only on query response and
calculate its' value for the specific query.

For example:
Assume this is the document.
doc
Id1/Id
TypeToys/Type
Price55/Price
/doc

I would like the response to contain another field which its' value to be
calculates from the value of 'price' and a given variable.
The field's value can be change from query to query and shouldn't be
indexed.

Is it possible to run a query and create a new field 'price_sale' on
runtime?

Thanks in advance,
Maya
Is there a way to do this on solr? 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Adding-field-to-query-result-tp4216396.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr Encoding Issue?

2015-07-08 Thread Tarala, Magesh
I'm ingesting a .TXT file with HTML content into Solr. The content has the 
following character highlighted below:
The file we get from CRM (also attached):
[cid:image001.png@01D0B972.75BE23F0]


After ingesting into solr, I see a different character. This is query response 
from solr management console.

[cid:image003.png@01D0B972.D1AED290]


Anybody know how I can prevent this from happening?

Thanks!


Solr cache when using custom scoring

2015-07-08 Thread amid
Hi,

We are using solr and implemented our own custom scoring.
The custom scoring code uses a parameter which is passed to the solr query;
different parameter values will change the score of the same query.

The problem we have is that this parameter is not part of the query
caching, so running the same query with different parameter values returns the
first cached result.

What is the best way to work around it (without removing the cache)? Is there
a way to tell solr to cache the query with the parameter value as well? Or maybe
add a dummy query to the query (the parameter is a pretty long JSON)?

Thanks,
Ami



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-cache-when-using-custom-scoring-tp4216419.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Difference in WordDelimiterFilter behavior between 4.7.2 and 4.9.1

2015-07-08 Thread Shawn Heisey
On 7/8/2015 9:26 AM, Alessandro Benedetti wrote:
 Taking a look into the documentation I see this inconsistent orderings in
 my opinion :

Alessandro, thank you for your reply.  I couldn't really tell what you
were saying.  I *think* you were agreeing with me that the current
behavior seems like a problem, but I'm not really sure.

At this point I think I should probably file a bug in Jira ... anyone
have any thoughts on that?

Thanks,
Shawn



Re: Difference in WordDelimiterFilter behavior between 4.7.2 and 4.9.1

2015-07-08 Thread Shawn Heisey
On 7/8/2015 2:10 PM, Shawn Heisey wrote:
 At this point I think I should probably file a bug in Jira ... anyone
 have any thoughts on that?

It appears that changing luceneMatchVersion from LUCENE_4_9 to LUCENE_47
has fixed this problem ... so I think somebody must have fixed WDF to
its current behavior, but put in a version check for the old behavior.

I think that WDF's position output with a current luceneMatchVersion is
wrong, but I'd like the input of someone who's a little more familiar
with the code and what SHOULD happen.

Thanks,
Shawn



SolrQueryRequest in SolrCloud vs Standalone Solr

2015-07-08 Thread Chetan Vora
Hi all

We have a cluster of standalone Solr cores (Solr 4.3) for which we had
built  some custom requesthandlers and filters which do query processing
using the Terms API. I'm now trying to port the custom functionality to
work in the Solr Cloud world.

Old configuration had standalone cores with the requesthandler embedded
into each:

core1

- requesthandler plugin

core2

- requesthandler plugin

We built an external (non-Solr) component that sent every query request to
each core and aggregated the results. When processing the request, within
each request handler, it obtained an index searcher by doing

SolrIndexSearcher searcher = solrQueryRequest.getSearcher();

followed by

searcher.search()...

Request1: http://localhost:xxx/solr/core1/plugin?q=blahblah

Request2: http://localhost:xxx/solr/core2/plugin?q=blahblah


In the SolrCloud version, I expected things to work similarly but at the
collection level.

New configuration:

SolrCloud collection with plugin

- shard1

- shard2

So my expectation is when I invoke

SolrIndexSearcher searcher = solrQueryRequest.getSearcher() ...


I obtain a searcher which can search against the collection i.e against all
the shards. But this doesn't seem to happen. It seems that the searcher is
executing the query only against shard1 !

Note: I peeked into the SolrQueryRequest object using a debugger and it has
a reference to a SolrCore object which just points to shard1.

Request: http://localhost:xxx/solr/collection1/plugin?q=blahblah

Am I doing something wrong? Is my expectation of how it should work flawed?

Any help would be appreciated.

Regards

CV


Re: Difference in WordDelimiterFilter behavior between 4.7.2 and 4.9.1

2015-07-08 Thread Jack Krupansky
In Lucene 4.8, LUCENE-5111: Fix WordDelimiterFilter offsets

https://issues.apache.org/jira/browse/LUCENE-5111

Make sure the documents are queried and indexed with the same Lucene match
version.


-- Jack Krupansky

On Wed, Jul 8, 2015 at 5:19 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 7/8/2015 2:19 PM, Shawn Heisey wrote:
  It appears that changing luceneMatchVersion from LUCENE_4_9 to LUCENE_47
  has fixed this problem ... so I think somebody must have fixed WDF to
  its current behavior, but put in a version check for the old behavior.

 The luceneMatchVersion change has fixed this specific issue with WDF,
 but these searches on 4.9.1 are still returning zero hits, and I don't
 yet know why.

 Thanks,
 Shawn




RE: About indexing embed file with solr

2015-07-08 Thread Allison, Timothy B.
This may have been an issue with Solr's wrapper of Tika.  See: 
https://issues.apache.org/jira/browse/SOLR-7189

-Original Message-
From: 步青云 [mailto:mailliup...@qq.com] 
Sent: Wednesday, June 17, 2015 10:17 PM
To: solr-user
Subject: About indexing embed file with solr

Hello,
  Could anyone receive my email? I'm new to Solr and I have some questions; 
could anyone help me with some answers?
  I index files directly by extracting the content of the file using the Tika 
embedded in Solr. There is no problem with normal files. But when I index a Word 
file that embeds another file, such as a PDF embedded in a Word (doc) file, I 
can't get the content of the embedded file. For example, I have a Word (doc) 
file with a PDF embedded in it, and I can't index the content of that PDF. When 
using the same Tika jar by itself to extract the content, I can get the content 
of the embedded file.
  I know Tika has been able to extract embedded files since version 1.3, and 
the version of my Solr is 4.9.1; the Tika used in this version of Solr is 1.5. 
I don't know why I can't get the content of the embedded file.
  Could anyone help me? Thank you very much.


   Ping Liu


 18 June. 2015


Re: Adding field to query result

2015-07-08 Thread Maya G
Hey,

Thanks for your response.

Yes, I think what I'm looking for is a pseudo field.
Is the product function a funtion query?
I assume I can replace the the product function in an implementation of my
own.

BTW - is the score field a pseudo field?

Maya



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Adding-field-to-query-result-tp4216396p4216424.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr cache when using custom scoring

2015-07-08 Thread Mikhail Khludnev
On Wed, Jul 8, 2015 at 11:30 PM, amid a...@donanza.com wrote:

 The custom scoring code use a parameter which passed to the solr query,


this param should be evaluated in equals() and hashCode(), shouldn't it?
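In Java terms this means the custom Query's equals() and hashCode() must include the parameter; below is a language-neutral sketch of the same cache-key idea (the names are illustrative, not Solr API):

```python
# Sketch of Mikhail's point: the scoring parameter must be part of the
# cache key, otherwise two searches that differ only in the parameter
# would share one cached entry.
cache = {}
calls = []  # track how many real searches were executed

def fake_search(query, score_param):
    # stand-in for the real (expensive) search + custom scoring
    calls.append((query, score_param))
    return "results for %s scored with %s" % (query, score_param)

def cached_search(query, score_param):
    key = (query, score_param)   # include the parameter in the key
    if key not in cache:
        cache[key] = fake_search(query, score_param)
    return cache[key]

r1 = cached_search("laptop", "boost=A")
r2 = cached_search("laptop", "boost=B")  # different param -> cache miss
r3 = cached_search("laptop", "boost=A")  # same param -> cache hit
```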


-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
mkhlud...@griddynamics.com


RE: Can I instruct the Tika Entity Processor to skip the first page using the DIH?

2015-07-08 Thread Allison, Timothy B.
Unfortunately, no.  We can't even do that now with straight Tika.  I imagine 
this is for pdf files?  If you'd like to add this as a feature, please submit a 
ticket over on Tika.

-Original Message-
From: Paden [mailto:rumsey...@gmail.com] 
Sent: Wednesday, July 08, 2015 12:14 PM
To: solr-user@lucene.apache.org
Subject: Can I instruct the Tika Entity Processor to skip the first page using 
the DIH?

Hello, I'm using the DIH to import some files from one of my local
directories. However, every single one of these files has the same first
page. So I want to skip that first page in order to optimize search. 

Can this be accomplished by an instruction within the dataimporthandler or,
if not, how could you do this? 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Can-I-instruct-the-Tika-Entity-Processor-to-skip-the-first-page-using-the-DIH-tp4216373.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Difference in WordDelimiterFilter behavior between 4.7.2 and 4.9.1

2015-07-08 Thread Shawn Heisey
On 7/8/2015 2:19 PM, Shawn Heisey wrote:
 It appears that changing luceneMatchVersion from LUCENE_4_9 to LUCENE_47
 has fixed this problem ... so I think somebody must have fixed WDF to
 its current behavior, but put in a version check for the old behavior.

The luceneMatchVersion change has fixed this specific issue with WDF,
but these searches on 4.9.1 are still returning zero hits, and I don't
yet know why.

Thanks,
Shawn



Re: Difference in WordDelimiterFilter behavior between 4.7.2 and 4.9.1

2015-07-08 Thread Alessandro Benedetti
Yes Shawn, I was raising the fact that I see strange values in the
positions as well.
You said you fixed it by going back to an old version?
That should not be OK; I mean, I assume the latest version should be the
best…
Any ideas or clarifications, guys?

2015-07-08 21:10 GMT+01:00 Shawn Heisey apa...@elyograg.org:

 On 7/8/2015 9:26 AM, Alessandro Benedetti wrote:
  Taking a look into the documentation I see this inconsistent orderings in
  my opinion :

 Alessandro, thank you for your reply.  I couldn't really tell what you
 were saying.  I *think* you were agreeing with me that the current
 behavior seems like a problem, but I'm not really sure.

 At this point I think I should probably file a bug in Jira ... anyone
 have any thoughts on that?

 Thanks,
 Shawn




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Adding field to query result

2015-07-08 Thread Erick Erickson
Yes, product is a function query and yes, you can write your own.

Score is a _really_ special pseudo-field, referenced by just the plain
score name. You can also use doc transformers to return things like
which shard the doc came from, but that's a different syntax just to
make it confusing: [shard], see:
https://cwiki.apache.org/confluence/display/solr/Transforming+Result+Documents

And you can make a custom one of these too; a place to start
would be TestCustomDocTransformer in the Solr tests.
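A minimal sketch of such a pseudo-field request (field names taken from the hypothetical document earlier in the thread; the 0.8 multiplier is purely illustrative):

```
fl=Id,Price,price_sale:product(Price,0.8)
```

Here price_sale exists only in the response; nothing is added to the index.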

Best,
Erick

On Wed, Jul 8, 2015 at 2:00 PM, Maya G maiki...@gmail.com wrote:
 Hey,

 Thanks for your response.

 Yes, I think what I'm looking for is a pseudo field.
 Is the product function a funtion query?
 I assume I can replace the the product function in an implementation of my
 own.

 BTW - is the score field a pseudo field?

 Maya



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Adding-field-to-query-result-tp4216396p4216424.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Boost Search word before Specific Content

2015-07-08 Thread Ahmet Arslan
Hi Jack,

Here is hypothetical example:

product_title_1 : dell laptop with laptop bag

product_title_2 : laptop bag with cover
product_title_3 : laptop bag and table

You create an artificial/additional field, 

before_field_1 : dell laptop
before_field_2 : laptop bag
before_field_3 : laptop bag

You can implement/embed any complex/custom logic (in the indexing code) for 
obtaining the values of this new boostable before_field.
You can even implement it in a custom update processor.

Then, at search time, use (e)Dismax's field boosting mechanism:
q=Laptop bag&qf=product_title^0.3 before_field^0.7&defType=edismax
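A sketch of the kind of indexing-side logic that could populate before_field (the marker words "with"/"and" are an assumption drawn from the hypothetical titles above, not a general rule):

```python
def before_field(title, markers=("with", "and")):
    """Return the tokens before the first marker word; as a fallback,
    return the first two tokens of the title."""
    tokens = title.split()
    for i, tok in enumerate(tokens):
        if tok in markers and i > 0:
            return " ".join(tokens[:i])   # everything before the marker
    return " ".join(tokens[:2])

print(before_field("dell laptop with laptop bag"))  # -> dell laptop
```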

Ahmet



On Wednesday, July 8, 2015 6:56 AM, JACK mfal...@gmail.com wrote:
Hi Ahmet,

Can you elaborate on it more?
Is it possible to solve my problem in Solr 5.0.0?
If yes, can you just explain how?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Boost-Search-word-before-Specific-Content-tp4216072p4216257.html

Sent from the Solr - User mailing list archive at Nabble.com.


Sorting documents by child documents

2015-07-08 Thread DorZion
Hey,

I'm using Solr 4.10.2 and I have child documents in every parent document.

Previously, I used FunctionQuery to sort the documents:

http://lucene.472066.n3.nabble.com/Sorting-documents-by-nested-child-docs-with-FunctionQueries-tp4209940.html
http://lucene.472066.n3.nabble.com/Sorting-documents-by-nested-child-docs-with-FunctionQueries-tp4209940.html
  

Now, I want to sort the documents by their child documents with normal
fields.

It doesn't work when I use the sort parameter. 

Thanks in advance,

Dor



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Sorting-documents-by-child-documents-tp4216263.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Jetty in Solr 5.2.0

2015-07-08 Thread Shawn Heisey
On 7/7/2015 10:51 AM, Steven White wrote:
 What I am faced with is this.  I have to create my own crawler, similar to
 DIH.  I have to deploy this on the same server as Solr (this is given, I
 cannot change it).  I have to manage this crawler just like I have to
 manage my Solr deployment using Solr API through HTTP request.  I figured
 if I deploy my application under Jetty, with Solr, then problem is solved.

At some point in the future, Jetty is expected to go away, with Solr
becoming a true standalone application.  There is no set timeframe for
this to happen.  It will hopefully happen before 6.0, but the work needs
to be *started* before any kind of guess can be made.

 The other option I looked at is writing my own handler for my crawler and
 plugging it into Solr's solrconfig.xml.  If I do this, then my crawler will
 run in the same JVM space as Solr, this is something I want to avoid.

If you install another webapp into the same Jetty as Solr, then it will
be running in the same JVM as Solr.  Jetty is the application that the
JVM runs, not Solr.  This is not very different from a handler in
solrconfig.xml.

 Yet another option is for me deploy a second instance of Jetty on the Solr
 server just for my crawler.  This is over kill in my opinion.
 
 What do folks think about this and what's the best way to approach this
 issue?  Deploy my crawler on a separate server is not an option and for my
 use case Solr will be used in a lightweight so there is plenty of CPU / RAM
 on this one server to host Solr and my crawler.

As you've already been told, it's a very strong recommendation that you
treat Solr as a standalone application and forget that it's running in a
standard servlet container.  That means that any other webapps, like the
crawler you mention, should be installed completely separately.

In my previous reply, I told you how you *could* install another
application into the Jetty included with Solr, but we don't recommend
it, because eventually you won't have that option.

Thanks,
Shawn



Re: Windows Version

2015-07-08 Thread Shawn Heisey
On 7/7/2015 10:43 AM, Allan Elkowitz wrote:
 So I am a newbie at Solr and am having trouble getting the examples working 
 on Windows 7.
 I downloaded and unzipped the distribution and have been able to get Solr up 
 and running.  I can access the admin page.  However, when I try to follow the 
 instructions for loading the examples I find that there is a file that I am 
 supposed to have called post.jar which I cannot find in the directory 
 specified, exampledocs.  There is a file called post in another directory 
 but it does not seem to be a .jar file.
 Two questions:
 1.  Has this been addressed on some site that I am not yet aware of?
 2.  What am I missing here?

The post.jar file is in example\exampledocs in the Solr 5.2.1 download.

The bin\post file is a shell script for Linux/UNIX systems that offers
easier access to the SimplePostTool class included in the solr-core jar.
 Unfortunately, no Windows equivalent (post.cmd) exists yet.

If you're getting the impression that Windows is a second-class citizen
around here, you are not really wrong.  A typical Solr user has found
that the free operating systems offer better performance and stability,
with the added advantage that they don't have to pay Microsoft a pile of
money in order to get useful work done.

Windows, especially the server operating systems, is a perfectly good
platform, but it's not free.

Thanks,
Shawn



Re: Jetty in Solr 5.2.0

2015-07-08 Thread Shawn Heisey
On 7/7/2015 10:03 AM, Steven White wrote:
 This may be a question to be posted on Jetty mailing list, but I figured I
 should start here first.
 
 Using Solr 5.2.0, when I start Solr, http://localhost:8983/solr/ is the
 entry point.  My question is:
 
 1) Where is solr on the file system?
 2) How can I add http://localhost:8983/MyHandler/ to Jetty?
 
 For #2, I'm exploring the possibility of using the existing Web Server to
 see if I can have an additional application running on the same host as
 Solr.

1) The answer to this question is not simple.  Solr is a Java servlet,
written using the servlet API.  The jetty home is the server
directory, and most everything else is relative to that location.

Solr comes in the download as the webapps/solr.war file (relative to
that server directory) ... which, like a jar file, is a zip archive.
The contexts/solr-jetty-context.xml file tells Jetty how to find that
war file, where to extract it, and what URL path (normally /solr) will
be used to access that application.

The .war archive normally gets extracted to solr-webapp/webapp, and that
is where Jetty finds all the bits that become Solr.  Solr has a home
directory, which defaults to ./solr (also relative to that server
directory), where the solr.xml file tells Solr how to locate everything
else.  The solr home can be overridden with commandline options.

Your question number 2 is indeed more properly addressed on the Jetty
list.  If what I've written below is not enough, ask further questions
there.

2) You need to write (or find) a servlet and install its .war file into
Jetty with a context fragment as we have done with Solr.  A servlet
container like Jetty is more complicated than a typical webserver like
Apache httpd.  It runs Java servlet applications, rather than simply
serving html files and other similar resources out of a document root.

A servlet can (and usually does) have static resources like html and
image files.  Solr's admin interface is mostly static html, css, images,
and javascript that runs in the user's browser and pulls dynamic info
from system handlers within Solr.

Thanks,
Shawn



Re: Difference in WordDelimiterFilter behavior between 4.7.2 and 4.9.1

2015-07-08 Thread Yonik Seeley
On Wed, Jul 8, 2015 at 6:50 PM, Shawn Heisey apa...@elyograg.org wrote:
 After the fix (with luceneMatchVersion at 4.9), both aaa and bbb end
 up at position 2.

Yikes, that's definitely wrong.

-Yonik


Re: Too many Soft commits and opening searchers realtime

2015-07-08 Thread Summer Shire
Yonik, Mikhail, Alessandro 

After a lot of digging around and isolation, all you guys were right. I was using 
property-based values and there was one place where it was 30 secs, and that was 
overriding my main props. 

Also, Yonik, thanks for the explanation on the realtime searcher. I wasn't sure 
if the maxWarmingSearchers error I was getting also had something to do with it. 

Thanks a lot 

 On Jul 8, 2015, at 5:28 AM, Yonik Seeley ysee...@gmail.com wrote:
 
 A realtime searcher is necessary for internal bookkeeping / uses if a
 normal searcher isn't opened on a commit.
 This searcher doesn't have caches and hence doesn't carry the weight
 that a normal searcher would.  It's also invisible to clients (it
 doesn't change the view of the index for normal searches).
 
 Your hard autocommit at 8 minutes with openSearcher=false will trigger
 a realtime searcher to open on every 8 minutes along with the hard
 commit.
 
 -Yonik
 
 
 On Tue, Jul 7, 2015 at 5:29 PM, Summer Shire shiresum...@gmail.com wrote:
 HI All,
 
 Can someone help me understand the following behavior.
 I have the following maxTimes on hard and soft commits
 
 yet I see a lot of Opening Searchers in the log
 org.apache.solr.search.SolrIndexSearcher - Opening Searcher@1656a258[main] 
 realtime
 also I see a soft commit happening almost every 30 secs
 org.apache.solr.update.UpdateHandler - start 
 commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
 <autoCommit>
   <maxTime>480000</maxTime>
   <openSearcher>false</openSearcher>
 </autoCommit>
 
 <autoSoftCommit>
   <maxTime>180000</maxTime>
 </autoSoftCommit>
 I tried disabling softCommit by setting maxTime to -1.
 On startup solrCore recognized it and logged Soft AutoCommit: disabled
 but I could still see softCommit=true
 org.apache.solr.update.UpdateHandler - start 
 commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
 <autoSoftCommit>
   <maxTime>-1</maxTime>
 </autoSoftCommit>
 
 Thanks,
 Summer


RE: EmbeddedSolrServer No such core: collection1

2015-07-08 Thread Chaushu, Shani
Hi,
My problem was that I didn't have a core.properties file, so it couldn't create 
the core.

Thanks for the help 
Shani

-Original Message-
From: Upayavira [mailto:u...@odoko.co.uk] 
Sent: Sunday, July 05, 2015 18:25
To: solr-user@lucene.apache.org
Subject: Re: EmbeddedSolrServer No such core: collection1

Hi Shani,

What version of Solr are you using? The instructions you quote look like they 
are for something like 4.4 from what you have written below.

The below is cloned from one of my projects, and hacked without testing, but I 
hope it gives you the idea of how it can be done.

  public SolrServer getEmbeddedServer(String solrHome, String
  solrConfigurationPath, String myCore) throws IOException {
    // Create solr_home directory with solr.xml
    new File(solrHome).mkdirs();
    FileUtils.copyFile(new File(config.getSolrXmlPath()), new
File(solrHome, "solr.xml"));

    // Create config dir for my new core
    File myCoreConfig = new File(solrHome + "/" + myCore + "/conf");
    myCoreConfig.mkdirs();
    FileUtils.copyDirectory(new File(solrConfigurationPath),
myCoreConfig);

    // Create core.properties file
    FileUtils.write(new File(solrHome + "/" + myCore,
"core.properties"), "name=" + myCore);

    // Create CoreContainer and EmbeddedSolrServer
    File solrXml = new File(solrHome, "solr.xml");
    CoreContainer coreContainer = CoreContainer.createAndLoad(solrHome,
solrXml);
    return new EmbeddedSolrServer(coreContainer, myCore);
  }


Upayavira

On Sun, Jul 5, 2015, at 01:17 PM, Chaushu, Shani wrote:
 Hi,
 I'm using EmbeddedSolrServer for testing Solr.
 I went step by step through these instructions (for Solr 4):
 https://wiki.searchtechnologies.com/index.php/Unit_Testing_with_Embedded_Solr
 I can see that the config loaded, but when I try to add a document, the 
 error I get is:
 org.apache.solr.common.SolrException: No such core: collection1
 
 I'm sure it's something in the solr.xml, but I couldn't find the 
 issue. Any thoughts?
 
 in the solr.xml I have:
 <solr>
 
 <solrcloud>
 <str name="host">${host:}</str>
 <int name="hostPort">${jetty.port:8983}</int>
 <str name="hostContext">${hostContext:solr}</str>
 <int name="zkClientTimeout">${zkClientTimeout:3}</int>
 <bool
 name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
 </solrcloud>
 
 <solr persistent="true">
 <cores adminPath="collection1" defaultCoreName="collection1">
 <core name="collection1" instanceDir="collection1" />
 </cores>
 </solr>
 
 
 <shardHandlerFactory name="shardHandlerFactory"
  class="HttpShardHandlerFactory">
 <int name="socketTimeout">${socketTimeout:0}</int>
 <int name="connTimeout">${connTimeout:0}</int>
 </shardHandlerFactory>
 
 </solr>
 
 Thanks,
 Shani
 
 
 -
 Intel Electronics Ltd.
 
 This e-mail and any attachments may contain confidential material for 
 the sole use of the intended recipient(s). Any review or distribution 
 by others is strictly prohibited. If you are not the intended 
 recipient, please contact the sender and delete all copies.



RE: Solr Encoding Issue?

2015-07-08 Thread Tarala, Magesh
Shawn - Stupid coding error in my java code. Used default charset. Changed to 
UTF-8 and problem fixed. 

Thanks again!

-Original Message-
From: Tarala, Magesh 
Sent: Wednesday, July 08, 2015 8:11 PM
To: solr-user@lucene.apache.org
Subject: RE: Solr Encoding Issue?

Wow, that makes total sense. Thanks Shawn!! 

I'll go down this path. 

Thanks,
Magesh

-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Wednesday, July 08, 2015 7:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Encoding Issue?

On 7/8/2015 6:09 PM, Tarala, Magesh wrote:
 I believe the issue is in solr. The character “à” is getting stored in solr 
 as “Ã ”. Notice the space after Ã.

 I'm using solrj to ingest the documents into solr. So, one of those could be 
 the culprit?

Solr accepts and outputs text in UTF-8.  The UTF-8 hex encoding for the à 
character is C3A0.

In the latin1 character set, hex C3 is the à character.  Similarly, in latin1, 
hex A0 is a non-breaking space.

So it sounds like your input is encoded as UTF-8, therefore that character in 
your input source is hex c3a0, but something in your indexing process is 
incorrectly interpreting the UTF-8 representation as latin1, so it sees it as 
à .

SolrJ is faithfully converting that input to UTF-8 and sending it to Solr.
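A quick demonstration of that exact mix-up:

```python
# UTF-8 bytes for "à" misread as latin-1 become "Ã" plus a
# non-breaking space -- which renders as "Ã " with a trailing space.
original = "\u00e0"                       # the character à
utf8_bytes = original.encode("utf-8")     # b'\xc3\xa0'
mojibake = utf8_bytes.decode("latin-1")   # 'Ã' + U+00A0 (non-breaking space)

print(repr(utf8_bytes))
print(repr(mojibake))
```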

Thanks,
Shawn



RE: Do I really need copyField when my app can do the copy?

2015-07-08 Thread Petersen, Robert
Perhaps some people, like those using DIH to feed their index, might not have 
that luxury, and copyField is the better way for them.  If you have an 
application, you can do it either way.  I have done both ways in different 
situations.

Robi

-Original Message-
From: Steven White [mailto:swhite4...@gmail.com] 
Sent: Wednesday, July 08, 2015 3:38 PM
To: solr-user@lucene.apache.org
Subject: Do I really need copyField when my app can do the copy?

Hi Everyone,

What good is the use of copyField in Solr's schema.xml if my application can do 
the copy into the designated field?  Having my application do so helps me 
simplify the schema.xml maintenance task, thus my motivation.

Thanks

Steve


Re: Jetty in Solr 5.2.0

2015-07-08 Thread Steven White
Thank you all for your help.  I will leave Solr as-is and not step on its
feet.

Steve

On Wed, Jul 8, 2015 at 2:29 AM, Shawn Heisey apa...@elyograg.org wrote:

 On 7/7/2015 10:51 AM, Steven White wrote:
  What I am faced with is this.  I have to create my own crawler, similar
 to
  DIH.  I have to deploy this on the same server as Solr (this is given, I
  cannot change it).  I have to manage this crawler just like I have to
  manage my Solr deployment using Solr API through HTTP request.  I figured
  if I deploy my application under Jetty, with Solr, then problem is
 solved.

 At some point in the future, Jetty is expected to go away, with Solr
 becoming a true standalone application.  There is no set timeframe for
 this to happen.  It will hopefully happen before 6.0, but the work needs
 to be *started* before any kind of guess can be made.

  The other option I looked at is writing my own handler for my crawler and
  plugging it into Solr's solrconfig.xml.  If I do this, then my crawler
 will
  run in the same JVM space as Solr, this is something I want to avoid.

 If you install another webapp into the same Jetty as Solr, then it will
 be running in the same JVM as Solr.  Jetty is the application that the
 JVM runs, not Solr.  This is not very different from a handler in
 solrconfig.xml.

  Yet another option is for me deploy a second instance of Jetty on the
 Solr
  server just for my crawler.  This is over kill in my opinion.
 
  What do folks think about this and what's the best way to approach this
  issue?  Deploy my crawler on a separate server is not an option and for
 my
  use case Solr will be used in a lightweight so there is plenty of CPU /
 RAM
  on this one server to host Solr and my crawler.

 As you've already been told, it's a very strong recommendation that you
 treat Solr as a standalone application and forget that it's running in a
 standard servlet container.  That means that any other webapps, like the
 crawler you mention, should be installed completely separately.

 In my previous reply, I told you how you *could* install another
 application into the Jetty included with Solr, but we don't recommend
 it, because eventually you won't have that option.

 Thanks,
 Shawn




Best way to facets with value preprocessing (w/ docValues)

2015-07-08 Thread Konstantin Gribov
Hi, folks.

Earlier I used solr.TextField with preprocessing (ASCII folding, lowercase
etc) on some fields for search and faceting. But on larger index it takes
several minutes to uninvert those fields for faceting (I use fieldValueCache
and warmup queries with facets). It becomes too expensive in case of frequent
soft commits (5-10 mins), so I want to migrate to docValues to avoid
uninvert phase.

Documentation[1] says that only Trie*Field, StrField and UUIDField (which
itself is subtype of StrField) support docValues=true.

I have tried two ways to workaround this issue:
1. Make a subtype of TextField which overrides `checkSchemaField`,
effectively turning docValues on for this TextField. All preprocessing is
specified in a TokenizerChain analyzer with KeywordTokenizerFactory (so it
produces exactly one token for each value in this multivalued field),
defined via schema.xml. It seems to work but I haven't tested it under
load. What are the potential caveats in such a scheme? Why isn't it used in
trunk Solr?
2. Make subtype of StrField which will perform hardcoded preprocessing
(like ASCII folding, lowercasing) but I can't find appropriate point to
insert this behavior. The only working method was to override both
toInternal and createFields (since creating BytesRef for docValues don't
use toInternal there) and do value preprocessing there. What are potential
caveats? Search becomes case-insensitive (since toInternal is used by
createField and the default tokenizer); facets become lowercase because
docValues are created lowercase by the createFields override.

The StrField-based variant should be faster than the TextField-based one, since
the TokenStream is reused internally in the first case and recreated on each
doc with TokenizerChain in the second. But the StrField-based approach
hardcodes the preprocessing.

Next issue is that I want to use prefix and suffix wildcard search for some
fields. As I understood from code it works only on TextField (because it
requires Analyzer to be an instance of TokenizerChain with
ReversedWildcardFilterFactory in TokenFilter chain). Should I use it in
StrField-based variant by overriding getIndexAnalyzer/getQueryAnalyzer or
it would break something?

[1]: https://cwiki.apache.org/confluence/display/solr/DocValues

-- 
Best regards,
Konstantin Gribov


Re: Tlog replay

2015-07-08 Thread Yonik Seeley
On Wed, Jul 8, 2015 at 12:31 PM, Summer Shire shiresum...@gmail.com wrote:
 Thanks Alessandro !

 Any idea on why I couldn't curl the solr core and pass the flag param ?

These flags are for internal use only.  Solr sets them, the client doesn't.

-Yonik


RE: Solr Encoding Issue?

2015-07-08 Thread Tarala, Magesh
Thanks Erick.

I believe the issue is in solr. The character “à” is getting stored in solr as 
“Ã ”. Notice the space after Ã.

I'm using solrj to ingest the documents into solr. So, one of those could be 
the culprit?


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, July 08, 2015 1:36 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Encoding Issue?

Attachments are pretty aggressively stripped by the e-mail server, so there's 
nothing to see, you'll have to paste it somewhere else and provide a link.

Usually, though, this is a character set issue with the browser using a 
different charset than Solr, it's really the same character, just displayed 
differently.

Shot in the dark though.

Erick

On Wed, Jul 8, 2015 at 10:49 AM, Tarala, Magesh mtar...@bh.com wrote:

  I’m ingesting a .TXT file with HTML content into Solr. The content 
 has the following character highlighted below:

 The file we get from CRM (also attached):

 [image: cid:image001.png@01D0B972.75BE23F0]





 After ingesting into solr, I see a different character. This is query 
 response from solr management console.



 [image: cid:image003.png@01D0B972.D1AED290]





 Anybody know how I can prevent this from happening?



 Thanks!



Re: Difference in WordDelimiterFilter behavior between 4.7.2 and 4.9.1

2015-07-08 Thread Shawn Heisey
On 7/8/2015 4:01 PM, Jack Krupansky wrote:
 In Lucene 4.8, LUCENE-5111: Fix WordDelimiterFilter offsets

 https://issues.apache.org/jira/browse/LUCENE-5111

 Make sure the documents are queried and indexed with the same Lucene match
 version.

Since I have updated the luceneMatchVersion on the 4.9.1 version to
LUCENE_47, I am now reindexing it, to see if that helps.

I discovered that I had some information backwards in my previous
messages -- it is *index* time analysis that differs.  Query time
analysis is the same across versions.  The reindex may very well fix
this problem, but luceneMatchVersion is a band-aid, and I think there is
a bug to be fixed.

I have no doubt that LUCENE-5111 fixed a real issue, but I think it also
caused some new problems.

When faced with text like aaa-bbb, the original term (created by
preserveOriginal) ends up at relative position 1.  Prior to this fix,
the next terms will be aaa at position 1 and bbb at position 2.  The
aaabbb term created by the catenation option also ends up at position
2.  This arrangement makes perfect sense to me.

After the fix (with luceneMatchVersion at 4.9), both aaa and bbb end
up at position 2.  I can't see how it is logical to end up with these
positions.  It breaks phrase queries on my index because the query-time
analysis puts these two terms at position 1 and 2.

The WDF options I chose seemed logical to me when I made them (about
four years ago), but I admit that I don't remember the exact motivation
behind those choices.  You can find the entire fieldType definition in a
previous message on this thread.  The two analysis chains are the same
except for WDF options.  Should I use different options?

Index-time options:

<filter class="solr.WordDelimiterFilterFactory"
  splitOnCaseChange="1"
  splitOnNumerics="1"
  stemEnglishPossessive="1"
  generateWordParts="1"
  generateNumberParts="1"
  catenateWords="1"
  catenateNumbers="1"
  catenateAll="0"
  preserveOriginal="1"
/>

Query-time options:
<filter class="solr.WordDelimiterFilterFactory"
  splitOnCaseChange="1"
  splitOnNumerics="1"
  stemEnglishPossessive="1"
  generateWordParts="1"
  generateNumberParts="1"
  catenateWords="0"
  catenateNumbers="0"
  catenateAll="0"
  preserveOriginal="0"
/>


Thanks,
Shawn



RE: Solr Encoding Issue?

2015-07-08 Thread Tarala, Magesh
Wow, that makes total sense. Thanks Shawn!! 

I'll go down this path. 

Thanks,
Magesh

-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Wednesday, July 08, 2015 7:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Encoding Issue?

On 7/8/2015 6:09 PM, Tarala, Magesh wrote:
 I believe the issue is in solr. The character “à” is getting stored in solr 
 as “Ã ”. Notice the space after Ã.

 I'm using solrj to ingest the documents into solr. So, one of those could be 
 the culprit?

Solr accepts and outputs text in UTF-8.  The UTF-8 hex encoding for the à 
character is C3A0.

In the latin1 character set, hex C3 is the à character.  Similarly, in latin1, 
hex A0 is a non-breaking space.

So it sounds like your input is encoded as UTF-8, therefore that character in 
your input source is hex c3a0, but something in your indexing process is 
incorrectly interpreting the UTF-8 representation as latin1, so it sees it as 
à .

SolrJ is faithfully converting that input to UTF-8 and sending it to Solr.

Thanks,
Shawn
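A common culprit on the SolrJ side is reading the .TXT file with the platform
default charset (for example, java.io.FileReader before Java 11 always used the
platform default, which on Windows is often a latin1 variant). A minimal sketch
of reading the file with an explicit UTF-8 charset before ingesting; the file
name here is hypothetical:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ReadUtf8 {
    public static void main(String[] args) throws IOException {
        // Read the raw bytes and decode them with an explicit charset,
        // instead of relying on the platform default.
        String content = new String(
                Files.readAllBytes(Paths.get("crm-export.txt")), // hypothetical path
                StandardCharsets.UTF_8);
        // 'content' can now be put into a SolrInputDocument safely.
        System.out.println(content.length());
    }
}
```

If the source files are not actually UTF-8, substitute the correct charset in
the decode step; the point is to make the choice explicit.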



Do I really need copyField when my app can do the copy?

2015-07-08 Thread Steven White
Hi Everyone,

What good is the use of copyField in Solr's schema.xml if my application
can do the copy into the designated field itself?  Having my application do so
helps me simplify the schema.xml maintenance task, hence my motivation.

Thanks

Steve


Re: Do I really need copyField when my app can do the copy?

2015-07-08 Thread Shawn Heisey
On 7/8/2015 4:38 PM, Steven White wrote:
 What good is the use of copyField in Solr's schema.xml if my application
 can do the copy into the designated field itself?  Having my application do so
 helps me simplify the schema.xml maintenance task, hence my motivation.

I can think of two main uses for copyField.  One is to combine the
inputs for multiple fields into a catchall field, the other is to
analyze the same input in multiple ways.  For instance, you may want a
field analyzed in one way for searching, but analyzed in a different way
to use for facets.

Your indexing application can indeed take care of that, but having Solr
do it means that your indexing application doesn't need to worry about
how the data is being used in search, it just has to get the information
to Solr.

There may be additional use cases, but those are the ones that came to
me when I thought about it for a couple of minutes.

Thanks,
Shawn
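Shawn's two uses can be illustrated with a minimal schema.xml sketch; the field
names below are made up for illustration only:

```xml
<!-- hypothetical fields, for illustration only -->
<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="body"  type="text_general" indexed="true" stored="true"/>

<!-- use 1: a catchall field combining several inputs for searching -->
<field name="catchall" type="text_general" indexed="true" stored="false"
       multiValued="true"/>
<copyField source="title" dest="catchall"/>
<copyField source="body"  dest="catchall"/>

<!-- use 2: the same input analyzed differently, e.g. as a raw string for faceting -->
<field name="title_facet" type="string" indexed="true" stored="false"/>
<copyField source="title" dest="title_facet"/>
```

The copies happen inside Solr at index time, so the indexing application only
sends "title" and "body" and never needs to know the derived fields exist.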



Re: Solr Encoding Issue?

2015-07-08 Thread Shawn Heisey
On 7/8/2015 6:09 PM, Tarala, Magesh wrote:
 I believe the issue is in solr. The character “à” is getting stored in solr 
 as “Ã ”. Notice the space after Ã.

 I'm using solrj to ingest the documents into solr. So, one of those could be 
 the culprit?

Solr accepts and outputs text in UTF-8.  The UTF-8 hex encoding for the
à character is C3A0.

In the latin1 character set, hex C3 is the à character.  Similarly, in
latin1, hex A0 is a non-breaking space.

So it sounds like your input is encoded as UTF-8, therefore that
character in your input source is hex c3a0, but something in your
indexing process is incorrectly interpreting the UTF-8 representation as
latin1, so it sees it as à .

SolrJ is faithfully converting that input to UTF-8 and sending it to Solr.

Thanks,
Shawn
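Shawn's byte-level explanation can be reproduced in a few lines of Java; this
is just a sketch demonstrating the mis-decoding, not code taken from the
original poster's setup:

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    public static void main(String[] args) {
        String original = "à";                                   // U+00E0
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8); // {0xC3, 0xA0}
        // Misinterpreting those UTF-8 bytes as latin1 yields 'Ã' (0xC3)
        // followed by a non-breaking space (0xA0) -- exactly "Ã ".
        String mojibake = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(mojibake);
    }
}
```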



Re: Solr cache when using custom scoring

2015-07-08 Thread amid
Not sure I get you; the parameter is passed to Solr as a string.
It seems Solr builds the cache key only from the query, sort, and range of
documents (from the doc: "This cache holds the results of previous searches: ordered
lists of document IDs (DocList) based on a query, a sort, and the range of
documents requested").

I'm searching for a good way to make sure this parameter is used as well, so that
different parameter values with the same query create different cache
keys.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-cache-when-using-custom-scoring-tp4216419p4216479.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Grouping and recip function not working with Sharding

2015-07-08 Thread Pankaj Sonawane
Hi Erick,

The example below is for the grouping issue, not for sorting.
I have indexed 1839 records, each with a 'NAME' field; there may be
duplicate records for each 'NAME' value.

Let's say there are 5 records with NAME='A-SERIES', similarly 3 records with
NAME='E-SERIES', etc.

I have 264 unique NAME values in total, so when I query the collection using
grouping it should return 264 unique groups with an ngroups value of 264.

But the query returns a response with ngroups as 558, although the length of
the groups array in the response is 264.



{
   "responseHeader":{
      "status":0,
      "QTime":19,
      "params":{
         "group.ngroups":"true",
         "indent":"true",
         "q":"*:*",
         "group.field":"NAME",
         "group":"true",
         "wt":"json"
      }
   },
   "grouped":{
      "NAME":{
         "matches":1839,
         "ngroups":558,  <-- This value should be 264
         "groups":[
            {
               "groupValue":"A-SERIES",
               "doclist":{

               }
            },
            {
               "groupValue":"B-SERIES",
               "doclist":{

               }
            },
            {
               "groupValue":"C-SERIES",
               "doclist":{

               }
            }
            --- Similarly there are total 264 such groups
         ]
      }
   }
}


From the reference guide:

group.ngroups and group.facet require that all documents in each group
must be co-located on the same shard in order for accurate counts to
be returned. Document routing via composite keys can be a useful
solution in many situations.

It's not clear what you think the problem here is. You say:
bq: Ex: Below response contains 5 groups (Which is correct) but
ngroups is 11. But you have rows set to 5, so?

As far as your sorting issue, again an example showing what you think
is wrong would be very helpful.

Best,
Erick



On Wed, Jul 8, 2015 at 6:38 AM, Pankaj Sonawane
pankaj4sonaw...@gmail.com wrote:
 Hi,

 I am using sharding (3 shards) with Zookeeper.

 When I query a collection using 
 *group=truegroup.field=NAMEgroup.ngroups=true* parameters, *ngroups* in
 response is incorrect. However, I am getting the correct count in the doclist array.

 Ex: Below response contains 5 groups (Which is correct) but ngroups is 11.

 {
    "responseHeader":{
       "status":0,
       "QTime":49,
       "params":{
          "group.ngroups":"true",
          "indent":"true",
          "start":"0",
          "q":"*:*",
          "group.field":"NAME",
          "group":"true",
          "wt":"json",
          "rows":"5"
       }
    },
    "grouped":{
       "NAME":{
          "matches":18,
          "ngroups":11,
          "groups":[
             {
                "groupValue":"A-SERIES",
                "doclist":{
                   "numFound":5,
                   "start":0,
                   "maxScore":1,
                   "docs":[
                      {
                         "NAME":"A-SERIES",
                         "_version_":1505559209034383400
                      }
                   ]
                }
             },
             {
                "groupValue":"B-SERIES",
                "doclist":{
                   "numFound":5,
                   "start":0,
                   "docs":[
                      {
                         "NAME":"B-SERIES",
                         "_version_":1505559209034383400
                      }
                   ]
                }
             },
             {
                "groupValue":"C-SERIES",
                "doclist":{
                   "numFound":1,
                   "start":0,
                   "docs":[
                      {
                         "NAME":"C-SERIES",
                         "_version_":1505559209034383400
                      }
                   ]
                }
             },
             {
                "groupValue":"D-SERIES",
                "doclist":{
                   "numFound":5,
                   "start":0,
                   "docs":[
                      {
                         "NAME":"D-SERIES",
                         "_version_":1505559209034383400
                      }
                   ]
                }
             },
             {
                "groupValue":"E-SERIES",
                "doclist":{
                   "numFound":3,
                   "start":0,
                   "maxScore":1,
                   "docs":[
                      {
                         "NAME":"E-SERIES",
                         "_version_":1505559209034383400
                      }
                   ]
                }
             }
          ]
       }
    }
 }

 I am facing the same problem with the recip function used to get the latest record
 on a date field when sharding: it returns records in the wrong order.

 Note: Same configuration works fine on single machine without sharding.

 Please Help me to find solution.

 Thanks.



On Wed, Jul 8, 2015 at 7:08 PM, Pankaj Sonawane pankaj4sonaw...@gmail.com
wrote:

 Hi,

 I am using sharding (3 shards) with Zookeeper.

 When I query a collection using 
 *group=truegroup.field=NAMEgroup.ngroups=true* parameters, *ngroups* in
 response is 
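Erick's pointer about document routing via composite keys can be sketched as
follows. With SolrCloud's default compositeId router, a document id of the form
"shardKey!docId" routes every document sharing the shard key to the same shard,
which is exactly the co-location that group.ngroups requires. The helper and
field names below are assumptions for illustration, not taken from Pankaj's
setup:

```java
public class CompositeIdDemo {
    // Build a compositeId-router key: all docs with the same group value
    // (here, the NAME field) land on the same shard, so per-shard group
    // counts add up to an accurate ngroups.
    static String compositeId(String groupValue, String uniqueId) {
        return groupValue + "!" + uniqueId;
    }

    public static void main(String[] args) {
        // In SolrJ this id would be set before adding the document, e.g.:
        //   doc.addField("id", compositeId(name, recordId));
        //   solrClient.add(doc);
        System.out.println(compositeId("A-SERIES", "123")); // A-SERIES!123
    }
}
```

Note that changing the id scheme requires reindexing, since existing documents
were routed by their old ids.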
