Re: changing state.json using ZKCLI

2017-01-10 Thread Shawn Heisey
On 1/10/2017 5:28 PM, Chetas Joshi wrote:
> I have got 2 shards having hash range set to null due to some index
> corruption.
>
> I am trying to manually get, edit and put the file.

> ./zkcli.sh -zkhost ${zkhost} -cmd putfile ~/colName_state.json
> /collections/colName/state.json
>
> I am getting FileNotFound exception with the putfile command

You've got the parameters backwards.  The zookeeper location comes first.

Run the zkcli script with no parameters to see the examples of the
various commands available, which includes this line:

zkcli.sh -zkhost localhost:9983 -cmd putfile /solr.xml
/User/myuser/solr/solr.xml
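
To make the argument order concrete, here is a minimal Python sketch that assembles the putfile command line. The helper name putfile_args and the local path /home/me/colName_state.json are hypothetical and only for illustration; the flags themselves are the ones shown in this thread.

```python
def putfile_args(zkhost, zk_path, local_path):
    """Build the zkcli argument list for putfile.

    The ZooKeeper destination path comes first, the local source file second.
    """
    return ["./zkcli.sh", "-zkhost", zkhost, "-cmd", "putfile", zk_path, local_path]


# Hypothetical host and local file, used only to show the ordering.
args = putfile_args(
    "localhost:9983",
    "/collections/colName/state.json",
    "/home/me/colName_state.json",
)
print(" ".join(args))
```

Swapping the last two arguments makes zkcli try to open the ZooKeeper path as a local file, which matches the FileNotFoundException reported above.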

Thanks,
Shawn



RE: reset version number

2017-01-10 Thread Kris Musshorn
Obviously deleting and rebuilding the core will work but is there another way?
K

-Original Message-
From: KRIS MUSSHORN [mailto:mussho...@comcast.net] 
Sent: Tuesday, January 10, 2017 12:00 PM
To: solr-user@lucene.apache.org
Subject: reset version number

SOLR 5.4.1 web admin interface has a version number in the selected core's 
overview. 
How does one reset this number? 

Kris 



changing state.json using ZKCLI

2017-01-10 Thread Chetas Joshi
Hello,

I have got 2 shards having hash range set to null due to some index
corruption.

I am trying to manually get, edit and put the file.

./zkcli.sh -zkhost ${zkhost} -cmd getfile /collections/colName/state.json
~/colName_state.json


./zkcli.sh -zkhost ${zkhost} -cmd clear /collections/colName/state.json


./zkcli.sh -zkhost ${zkhost} -cmd putfile ~/colName_state.json
/collections/colName/state.json


I am getting FileNotFound exception with the putfile command


Exception in thread "main" java.io.FileNotFoundException:
/collections/colName/state.json (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:146)
    at java.io.FileInputStream.<init>(FileInputStream.java:101)


I just got the file from the same location.

Why is it throwing this exception?

How should I find out the correct location on the zookeeper node?


Thanks!


Re: how to achieve mulitple wild card searches in solr 5.2.1

2017-01-10 Thread Erick Erickson
I really don't understand what you mean by "compress", maybe
provide a couple of samples?

Best,
Erick

On Tue, Jan 10, 2017 at 10:20 AM, dinesh naik  wrote:
> Thanks Erick,
> I tried making it to String, but i need to compress the part first and then
> look for wild card search?
>
> With string i can not do that.
> How do i achieve this?
>
> On Wed, Jan 4, 2017 at 2:52 AM, Erick Erickson 
> wrote:
>
>> My guess is that you're searching on a _tokenized_ field and that
>> you'd get the results you expect on a string field..
>>
>> Add &debug=query to the URL and you'll see what the parsed query is
>> and that'll give you a very good idea of what's actually happening.
>>
>> Best,
>> Erick
>>
>> On Tue, Jan 3, 2017 at 7:16 AM, dinesh naik 
>> wrote:
>> > Hi all,
>> > How can we achieve multiple wild card searches in solr?
>> >
>> > For example: I am searching for AB TEST1.EC*TEST2*
>> > But I get also results for AB TEST1.EC*TEST3*, AB TEST1.EC*TEST4*,?
>> instead
>> > of AB TEST1.EC*TEST2*
>> >
>> > It seems only the first * is being considered, second * is not considered
>> > for wildcard match.
>> > --
>> > Best Regards,
>> > Dinesh Naik
>>
>
>
>
> --
> Best Regards,
> Dinesh Naik


regarding extending classes in org.apache.solr.client.solrj.io.stream.metrics package

2017-01-10 Thread radha krishnan
 Hi,

I want to extend the update(Tuple tuple) method in the MaxMetric, MinMetric,
SumMetric, and MeanMetric classes.

Could you please make the variables and methods listed below protected in
those classes, so that they are easier to extend?

variables
---------
longMax
doubleMax
columnName

methods
-------
init



Thanks,

Radhakrishnan D


Support Multiple (AND) Context Filter Query in Suggestor

2017-01-10 Thread Jeffery Yuan
As with normal queries, we usually want to apply multiple filter queries when
running auto-completion.

It would be great if the suggester could return (the titles of) docs that are
meaningful to the current user, which requires multiple filters.

Is this possible in the current Solr (6.4) implementation?

Thanks a lot.
Jeffery Yuan



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Support-Multiple-AND-Context-Filter-Query-in-Suggestor-tp4313416.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Missing shards/hash range

2017-01-10 Thread Chetas Joshi
Want to add a couple of things

1) Shards were not deleted using the delete replica collection API
endpoint.
2) instanceDir and dataDir exist for all 20 shards.

On Tue, Jan 10, 2017 at 11:34 AM, Chetas Joshi 
wrote:

> Hello,
>
> The following is my config
>
> Solr 5.5.0 on HDFS (SolrCloud of 25 nodes)
> collection with shards=20, maxShards per node=1, replicationFactor=1,
> autoAddReplicas=true
>
> The ingestion process had been working fine for the last 3 months.
>
> Yesterday, the ingestion process started throwing the following exceptions:
> SolrException: No active slice servicing hash code 7270a60c in
> DocCollection()
>
> I can see that suddenly 2 shards missing. Solr Cloud UI says number of
> shards for the collection are 18. Somehow, shards have got deleted. The
> data is available on hdfs.
>
> Is there a way I can restart those shards on 2 of the hosts and provide a
> particular hash range(The hash ranges that are missing) ?
>
> Thanks!
>
>


Missing shards/hash range

2017-01-10 Thread Chetas Joshi
Hello,

The following is my config

Solr 5.5.0 on HDFS (SolrCloud of 25 nodes)
collection with shards=20, maxShards per node=1, replicationFactor=1,
autoAddReplicas=true

The ingestion process had been working fine for the last 3 months.

Yesterday, the ingestion process started throwing the following exceptions:
SolrException: No active slice servicing hash code 7270a60c in
DocCollection()

I can see that 2 shards are suddenly missing. The Solr Cloud UI says the number of
shards for the collection is 18. Somehow, the shards have been deleted. The
data is still available on HDFS.

Is there a way I can restart those shards on 2 of the hosts and provide a
particular hash range (the hash ranges that are missing)?
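
To illustrate what "No active slice servicing hash code" means, here is a rough Python sketch that checks whether any shard range covers a given hash. It assumes ranges are written as hex "min-max" pairs and compared as signed 32-bit integers; the shard names and ranges below are hypothetical, not taken from the real collection.

```python
def to_signed32(hex_str):
    """Interpret a hex string as a signed 32-bit integer."""
    value = int(hex_str, 16)
    return value - (1 << 32) if value >= (1 << 31) else value


def covering_shard(hash_hex, ranges):
    """Return the name of the shard whose range includes the hash, or None."""
    h = to_signed32(hash_hex)
    for shard, rng in ranges.items():
        if rng is None:  # a shard whose range was lost cannot serve any hash
            continue
        lo_hex, hi_hex = rng.split("-")
        if to_signed32(lo_hex) <= h <= to_signed32(hi_hex):
            return shard
    return None


# Hypothetical layout: shard3 should own 70000000-7fffffff but has no range.
ranges = {
    "shard1": "80000000-ffffffff",
    "shard2": "0-6fffffff",
    "shard3": None,
}
print(covering_shard("7270a60c", ranges))
```

With the hypothetical layout above no shard covers 7270a60c, so documents hashing into that gap have nowhere to go until the missing range is restored.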

Thanks!


Re: Query Elevation Component as a Managed Resource

2017-01-10 Thread Jeffery Yuan
Hi, kamaci:

  That's great :) It's so nice of you to create the patch and implement the
feature, which has been wanted for a long time :)

Best,
Jeffery Yuan



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Query-Elevation-Component-as-a-Managed-Resource-tp4312089p4313380.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: CDCR logging is Needlessly verbose, fills up the file system fast

2017-01-10 Thread Webster Homer
I know that we have never set the schedule parameter to 1 millisecond. We
have specified either 100 or 1000. I wondered why it was writing so
frequently. I suspect a bug somewhere.

However, we will have multiple collections using CDCR, and in some cases
the source collection will have multiple targets.

I understand your point about the request handler. I think being able to
specify messaging levels for specific handlers is essential.

On Mon, Jan 9, 2017 at 10:07 PM, Shawn Heisey  wrote:

> On 12/22/2016 8:10 AM, Webster Homer wrote:
> > While testing CDCR I found that it is writing tons of log messages per
> > second. Example:
> > 2016-12-21 23:24:41.652 INFO  (qtp110456297-13) [c:sial-catalog-material
> > s:shard1 r:core_node1 x:sial-catalog-material_shard1_replica1]
> > o.a.s.c.S.Request [sial-catalog-material_shard1_replica1]  webapp=/solr
> > path=/cdcr params={qt=/cdcr&action=BOOTSTRAP_STATUS&wt=javabin&version=2}
> > status=0 QTime=0
> > 2016-12-21 23:24:41.653 INFO  (qtp110456297-18) [c:sial-catalog-material
> > s:shard1 r:core_node1 x:sial-catalog-material_shard1_replica1]
> > o.a.s.c.S.Request [sial-catalog-material_shard1_replica1]  webapp=/solr
> > path=/cdcr params={qt=/cdcr&action=BOOTSTRAP_STATUS&wt=javabin&version=2}
> > status=0 QTime=0
>
> I hadn't looked closely at the messages you were seeing in your logs
> until now.
>
> These messages are *request* logging.  This is the same code path that
> logs every query -- it's not specific to CDCR.  It's just logging all
> the requests that Solr is receiving.  If this log message were changed
> to DEBUG, then Solr would not log queries by default.  A large number of
> Solr users want that logging.
>
> I think that you could probably avoid seeing these logs by configuring
> log4j to not log things tagged
> as org.apache.solr.core.SolrCore.Request (even though it's not a real
> class, I think log4j can still configure it) ... but then you wouldn't
> get your queries logged either.
>
> In order to not log these particular messages, but still log queries and
> other requests, the request logging code will need to have a way to
> specify that certain messages should not be logged.  This might be
> something that could be configurable at the request handler definition
> level -- put something in the requestHandler configuration (for /cdcr in
> this case) that tells it to skip logging.  That seems like a good
> feature to have.
>
> After looking at the CDCR configuration page in the reference guide, I
> might have a little more insight.  You're getting one of these logs
> every 1-2 milliseconds ... so it sounds like you have configured the
> CDCR with a schedule of one millisecond.  The default value for the
> replicator schedule is ten milliseconds, and the update log
> synchronizer defaults to a full minute.  I'm guessing that CDCR is not
> designed to have such a low schedule value.  I would personally
> configure the replicator schedule even higher than the default --
> network latency between Internet locations is often longer than ten
> milliseconds.
>
> Thanks,
> Shawn
>
>

-- 


This message and any attachment are confidential and may be privileged or 
otherwise protected from disclosure. If you are not the intended recipient, 
you must not copy this message or attachment or disclose the contents to 
any other person. If you have received this transmission in error, please 
notify the sender immediately and delete the message and any attachment 
from your system. Merck KGaA, Darmstadt, Germany and any of its 
subsidiaries do not accept liability for any omissions or errors in this 
message which may arise as a result of E-Mail-transmission or for damages 
resulting from any unauthorized changes of the content of this message and 
any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its 
subsidiaries do not guarantee that this message is free of viruses and does 
not accept liability for any damages caused by any virus transmitted 
therewith.

Click http://www.emdgroup.com/disclaimer to access the German, French, 
Spanish and Portuguese versions of this disclaimer.


reset version number

2017-01-10 Thread KRIS MUSSHORN
SOLR 5.4.1 web admin interface has a version number in the selected core's 
overview. 
How does one reset this number? 

Kris 


CDCR Alias support?

2017-01-10 Thread Webster Homer
Looking at the CDCR API and documentation, I wondered whether the source and
target collection names could be aliases. This is not discussed in the CDCR
documentation. I plan to test this when I have time, but if someone
knows for certain it might save some time.



RE: Can't get spelling suggestions to work properly

2017-01-10 Thread jimi.hullegard
No one has any input on my post below about the spelling suggestions? I just 
find it a bit frustrating not being able to understand this feature better, and 
why it doesn't give the expected results. A built in "explain" feature really 
would have helped.

/Jimi

-Original Message-
From: jimi.hulleg...@svensktnaringsliv.se 
[mailto:jimi.hulleg...@svensktnaringsliv.se] 
Sent: Friday, December 16, 2016 9:58 PM
To: solr-user@lucene.apache.org
Subject: Can't get spelling suggestions to work properly

Hi,

I'm trying to add the spelling suggestion feature to our search, but I'm having 
problems getting suggestions on some misspellings.

For example, the Swedish word 'mycket' exists in ~14.000 of a total of ~40.000 
documents in our index.

A search for the incorrect spelling 'myket' (a missing 'c') gives several 
spelling suggestions, and the top one is 'mycket'. This is the wanted/expected 
behavior.

But a search for the incorrect spelling 'mycet' (a missing 'k') gives no 
spelling suggestions.

The only difference between these two searches is that the one that results in 
spelling suggestions had zero results, while the other one had two (2) results. 
These two documents contain this incorrect spelling ('mycet'). Can this be the 
cause of no spelling suggestions? But I have set 'maxQueryFrequency' to 0.001, 
and with 40.000 documents in the index that should mean that the word can exist 
in up to 40 documents, and since 2 is less than 40 I argue that this word 
should be considered a spelling mistake. But for some reason the solr 
spellchecker considers 'myket' an incorrect spelling, while 'mycet' 
is incorrectly considered a correct spelling.
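
The arithmetic in the paragraph above can be written out as a small Python sketch. It only restates the reading of 'maxQueryFrequency' given in this message (a fraction of the documents in the index); it is not a description of how the Solr spellchecker evaluates the setting internally, and the function name is made up for illustration.

```python
def max_docs_for_misspelling(max_query_frequency, total_docs):
    """Largest document count at which a term would still count as misspelled,
    assuming the setting is a fraction of all documents in the index."""
    return int(round(max_query_frequency * total_docs))


threshold = max_docs_for_misspelling(0.001, 40000)
mycet_docs = 2  # documents containing the misspelling 'mycet'

print(threshold)
print(mycet_docs <= threshold)
```

Under that reading, 'mycet' falls well below the threshold, which is why the missing suggestions are surprising.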

Also, I tried with spellcheck.accuracy=0 just to rule out that I have a too 
high accuracy setting, but that didn't help.

Can someone see what I'm doing wrong, or give some tips on configuration 
changes and/or how I can troubleshoot this? For example, is there any way to 
debug the spellchecker function?


Here are the searches:

Search for 'myket':

http://localhost:8080/solr/s2/select/?q=myket=100=score+desc=*%2Cscore%2C%5Bexplain+style%3Dtext%5D=edismax=title%5E2=swedishText1%5E1=true=0=200=%2Bactivatedate%3A%5B*+TO+NOW%5D+%2Bexpiredate%3A%5BNOW+TO+*%5D+%2B%28state%3Apublished+OR+state%3Adraft-published+OR+state%3Asubmitted-published+OR+state%3Aapproved-published%29=xml=true

Spellcheck output for 'myket':


 
  

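<lst name="spellcheck">
 <lst name="suggestions">
  <lst name="myket">
   <int name="numFound">16</int>
   <int name="startOffset">0</int>
   <int name="endOffset">5</int>
   <int name="origFreq">0</int>
   <arr name="suggestion">
    <lst>
     <str name="word">mycket</str>
     <int name="freq">14039</int>
    </lst>
    [...]
   </arr>
  </lst>
  <bool name="correctlySpelled">false</bool>
  <lst name="collation">
   <str name="collationQuery">mycket</str>
   <int name="hits">14005</int>
   <lst name="misspellingsAndCorrections">
    <str name="myket">mycket</str>
   </lst>
  </lst>
  [...]
 </lst>
</lst>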
  
 



Search for 'mycet':

http://localhost:8080/solr/s2/select/?q=mycet=100=score+desc=*%2Cscore%2C%5Bexplain+style%3Dtext%5D=edismax=title%5E2=swedishText1%5E1=true=0=200=%2Bactivatedate%3A%5B*+TO+NOW%5D+%2Bexpiredate%3A%5BNOW+TO+*%5D+%2B%28state%3Apublished+OR+state%3Adraft-published+OR+state%3Asubmitted-published+OR+state%3Aapproved-published%29=xml=true

Spellcheck output for 'mycet':


RE: ICUFoldingFilter with swedish characters, and tokens with the keyword attribute?

2017-01-10 Thread jimi.hullegard
Hi Eric.

> But that's not the most important bit. Have you considered something like 
> MappingCharFilterFactory? 
> Unfortunately that's a charFilter which transforms everything before it gets 
> to the repeatFilter so you'd have to use two fields.

Yes, that is actually what I tried after giving up the idea of being able to 
tweak (ie configure) the ICUFoldingFilter to work as I wanted it to. With a 
custom modification to an existing mapping file, with all the swedish 
characters commented out, it worked just fine.
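
As a rough illustration of the folding rule described above (fold diacritics, but leave the Swedish letters alone), here is a small Python sketch. It is not the mapping file and not how the Solr char filter works internally; the function name fold_diacritics and the KEEP set are made up for this example.

```python
import unicodedata

# Characters that are letters in their own right in Swedish and must survive folding.
KEEP = set("åäöÅÄÖ")


def fold_diacritics(text):
    """Strip combining marks from every character except those in KEEP."""
    out = []
    # Normalize to NFC first so that 'å', 'ä', 'ö' are single code points.
    for ch in unicodedata.normalize("NFC", text):
        if ch in KEEP:
            out.append(ch)
        else:
            decomposed = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


print(fold_diacritics("café åäö"))
print(fold_diacritics("Penélope Cruz"))
```

The mapping-file approach achieves the same effect declaratively: every folded character is listed explicitly, and the Swedish ones are simply left out of the list.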

The fact that it is a charFilter, and the problems that causes, is something I 
have already thought about and consider to be OK. For now we won't add another 
field (for the non-folded text), but we might very well do that in the future.

/Jimi


Re: how to achieve mulitple wild card searches in solr 5.2.1

2017-01-10 Thread dinesh naik
Thanks Erick,
I tried making it to String, but i need to compress the part first and then
look for wild card search?

With string i can not do that.
How do i achieve this?

On Wed, Jan 4, 2017 at 2:52 AM, Erick Erickson 
wrote:

> My guess is that you're searching on a _tokenized_ field and that
> you'd get the results you expect on a string field..
>
> Add &debug=query to the URL and you'll see what the parsed query is
> and that'll give you a very good idea of what's actually happening.
>
> Best,
> Erick
>
> On Tue, Jan 3, 2017 at 7:16 AM, dinesh naik 
> wrote:
> > Hi all,
> > How can we achieve multiple wild card searches in solr?
> >
> > For example: I am searching for AB TEST1.EC*TEST2*
> > But I get also results for AB TEST1.EC*TEST3*, AB TEST1.EC*TEST4*,?
> instead
> > of AB TEST1.EC*TEST2*
> >
> > It seems only the first * is being considered, second * is not considered
> > for wildcard match.
> > --
> > Best Regards,
> > Dinesh Naik
>



-- 
Best Regards,
Dinesh Naik


RE: Debug logging in Maven project

2017-01-10 Thread Markus Jelsma
Ahá, I am stupid indeed. I forgot I also had to change slf4j-nop to 
slf4j-simple in my pom.xml.

    
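    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-simple</artifactId>
      <version>1.7.21</version>
      <scope>test</scope>
    </dependency>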
    

Sorry for the noise!

Markus

 
 
-Original message-
> From:Markus Jelsma 
> Sent: Tuesday 10th January 2017 15:10
> To: solr-user@lucene.apache.org
> Subject: RE: Debug logging in Maven project
> 
> Indeed, there were some changes recently but i also can't get logging to work 
> on older versions such as 6.0. 
> 
> Thanks,
> Markus
> 
>  
>  
> -Original message-
> > From:Pushkar Raste 
> > Sent: Tuesday 10th January 2017 14:53
> > To: solr-user@lucene.apache.org
> > Subject: Re: Debug logging in Maven project
> > 
> > Seems like you have enabled only console appender. I remember there was a
> > > change made to disable console appender if Solr is started in background
> > mode.
> > 
> > On Jan 10, 2017 5:55 AM, "Markus Jelsma"  wrote:
> > 
> > > Hello,
> > >
> > > I used to enable debug logging in my Maven project's unit tests by just
> > > setting log4j's global level to DEBUG, very handy, especially in debugging
> > > some Solr Cloud start up issues. Since a while, not sure to long, i don't
> > > seem to be able to get any logging at all. This project depends on 6.3.
> > > Anyone here that can tell me how to get something so simple but so helpful
> > > back to work?
> > >
> > > Many thanks,
> > > Markus
> > >
> > > $ cat src/test/resources/log4j.properties
> > > log4j.rootLogger=debug,info,stdout
> > > log4j.appender.stdout=org.apache.log4j.ConsoleAppender
> > > log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
> > > log4j.appender.stdout.layout.ConversionPattern=%5p [%t] (%F:%L) - %m%n
> > >
> > >
> > 
> 


RE: Debug logging in Maven project

2017-01-10 Thread Markus Jelsma
Indeed, there were some changes recently but i also can't get logging to work 
on older versions such as 6.0. 

Thanks,
Markus

 
 
-Original message-
> From:Pushkar Raste 
> Sent: Tuesday 10th January 2017 14:53
> To: solr-user@lucene.apache.org
> Subject: Re: Debug logging in Maven project
> 
> Seems like you have enabled only console appender. I remember there was a
> > change made to disable console appender if Solr is started in background
> mode.
> 
> On Jan 10, 2017 5:55 AM, "Markus Jelsma"  wrote:
> 
> > Hello,
> >
> > I used to enable debug logging in my Maven project's unit tests by just
> > setting log4j's global level to DEBUG, very handy, especially in debugging
> > some Solr Cloud start up issues. Since a while, not sure to long, i don't
> > seem to be able to get any logging at all. This project depends on 6.3.
> > Anyone here that can tell me how to get something so simple but so helpful
> > back to work?
> >
> > Many thanks,
> > Markus
> >
> > $ cat src/test/resources/log4j.properties
> > log4j.rootLogger=debug,info,stdout
> > log4j.appender.stdout=org.apache.log4j.ConsoleAppender
> > log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
> > log4j.appender.stdout.layout.ConversionPattern=%5p [%t] (%F:%L) - %m%n
> >
> >
> 


Re: Debug logging in Maven project

2017-01-10 Thread Pushkar Raste
Seems like you have enabled only console appender. I remember there was a
change made to disable console appender if Solr is started in background
mode.

On Jan 10, 2017 5:55 AM, "Markus Jelsma"  wrote:

> Hello,
>
> I used to enable debug logging in my Maven project's unit tests by just
> setting log4j's global level to DEBUG, very handy, especially in debugging
> some Solr Cloud start up issues. Since a while, not sure to long, i don't
> seem to be able to get any logging at all. This project depends on 6.3.
> Anyone here that can tell me how to get something so simple but so helpful
> back to work?
>
> Many thanks,
> Markus
>
> $ cat src/test/resources/log4j.properties
> log4j.rootLogger=debug,info,stdout
> log4j.appender.stdout=org.apache.log4j.ConsoleAppender
> log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
> log4j.appender.stdout.layout.ConversionPattern=%5p [%t] (%F:%L) - %m%n
>
>


RE: How to integrate SOLR in ibm filenet 5.2.1?

2017-01-10 Thread Markus Jelsma
The download links should work properly. Maybe try another mirror. I can 
confirm the download works fine:
http://manifoldcf.apache.org/en_US/download.html#Latest+2.x+release+%28Apache+ManifoldCF+2.6%2C+2016+Dec+30%29
 
 
-Original message-
> From:puneetmishra2555 
> Sent: Tuesday 10th January 2017 13:41
> To: solr-user@lucene.apache.org
> Subject: RE: How to integrate SOLR in ibm filenet 5.2.1?
> 
> Hi Team
> 
> Thanks for your response, but I am not able to download ManifoldCF. Please help
> me with the download so that I can check whether it will work for FileNet or
> not, because none of the download links given are working.
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/How-to-integrate-SOLR-in-ibm-filenet-5-2-1-tp4313090p4313328.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 


How to integrate SOLR in ibm filenet 5.2.1?

2017-01-10 Thread puneetmishra2555
Hi Team 

Thanks for your response, but I am not able to download ManifoldCF. Please help
me with the download so that I can check whether it will work for FileNet or
not, because none of the download links given are working. 




--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-integrate-SOLR-in-ibm-filenet-5-2-1-tp4313330.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: ICUFoldingFilter with swedish characters, and tokens with the keyword attribute?

2017-01-10 Thread Erick Erickson
Jimi:

The critical line for the KeywordRepeatFilter is "This is useful if
used with a stem filter that respects the KeywordAttribute to index
the stemmed and the un-stemmed version of a term into the same
field.". There is no guarantee that all filters _after_ the
KeywordRepeatFilter respect the keyword attribute.

Doesn't seem like it would be difficult to add if you'd like to submit
a patch though.

But that's not the most important bit. Have you considered something
like MappingCharFilterFactory? Unfortunately that's a charFilter which
transforms everything before it gets to the repeatFilter so you'd have
to use two fields.

Best,
Erick

On Tue, Jan 10, 2017 at 1:02 AM,   wrote:
> Hi,
>
> I wasn't happy with how our current solr configuration handled diacritics 
> (like 'é') in the text and in search queries, since it simply considered the 
> letter with a diacritic as a distinct letter. I.e. 'é' didn't match 'e', and 
> vice versa. Except for a handful of rare words where the diacritical sign in 'é' 
> actually changes the word's meaning, it is usually used in names of people and 
> places, and the expected behavior when searching is to not have to type them 
> and still get the expected results (like searching for 'Penelope Cruz' and 
> getting hits for 'Penélope Cruz').
>
> When reading online about how to handle diacritics in solr, it seems that the 
> general recommendation, when no language specific solution exists that 
> handles this, is to use the ICUFoldingFilter. However this filter doesn't 
> really come with a lot of documentation, and doesn't seem to have any 
> configuration options at all (at least not documented).
>
> So what I ended up with doing was simply to add the ICUFoldingFilterFactory 
> in the middle of the existing analyzer chain, like this:
>
> 
>  
>   <charFilter class="solr.HTMLStripCharFilterFactory" />
>   <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([.])" replacement=" " />
>   <tokenizer class="solr.StandardTokenizerFactory" />
>   <filter class="solr.LowerCaseFilterFactory" />
>   <filter class="solr.KeywordRepeatFilterFactory" />
>   <filter class="solr.ICUFoldingFilterFactory"/>
>   <filter class="solr.SwedishLightStemFilterFactory" />
>   <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>  
> 
>
>
> But that didn't really give me the results I want. For example, using the 
> analysis debug tool I see that the text 'café åäö' becomes 'cafe caf aao'. 
> And there are two problems with that result:
>
> 1. It doesn't respect keyword attribute
> 2. It folds the Swedish characters 'åäö' into 'aao'
>
> The disregard of the keyword attribute is bad enough, but the mangling of the 
> Swedish language is really a show stopper for us. The Swedish language 
> doesn't consider 'ö', for example, to be the letter 'o' with two diacritical 
> dots above it, just as 'Q' isn't considered to be the letter 'O' with a 
> diacritical "squiggly line" at the bottom. So when handling Swedish text, 
> these characters ('åäöÅÄÖ') shouldn't be folded, because then there will be 
> too many "collisions".
>
> For example, when searching for 'påstå' ('claim'), one doesn't want hits 
> about 'pasta' (you guessed it, it means 'pasta'), just as one doesn't want to 
> get hits about 'aga' ('corporal punishment, usually against children') when 
> searching for 'äga' ('to own'). Or even worse, when searching for 'höra' ('to 
> hear'), one most likely doesn't want hits about 'hora' ('prostitute'). And I 
> can go on... :)
>
> So, is there a way for us to make the ICUFoldingFilter work in a better way? 
> Ie configure it to respect the keyword attribute and ignore 'åäö' characters 
> when folding, but otherwise fold all diacritical characters into the 
> non-diacritical form. Or how would you recommend us to configure our analyzer 
> chain to accomplish this?
>
> Regards
> /Jimi


Re: Query Elevation Component as a Managed Resource

2017-01-10 Thread Furkan KAMACI
Hi Jeffery,

I was checking whether an issue had been raised for it. Thanks for pointing
it out; I'm planning to create a patch.

Kind Regards,
Furkan KAMACI


On Mon, Jan 9, 2017 at 6:44 AM, Jeffery Yuan  wrote:

> I am looking for same things.
>
> Seems Solr doesn't support this.
>
> Maybe you can vote for https://issues.apache.org/jira/browse/SOLR-6092, so
> add a patch for it :)
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/Query-Elevation-Component-as-a-Managed-
> Resource-tp4312089p4313034.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Duplicate suggestions when having multiple shards

2017-01-10 Thread Erick Erickson
As for the question about different weights, down the page of this
article there's an explanation of why stats are different on different
replicas in the same shard:
https://support.lucidworks.com/hc/en-us/articles/115000888308-Getting-different-results-while-issuing-a-query-multiple-times-in-SolrCloud-

A very similar argument holds for two shards with identical documents.

Don't have a good idea for your second question.

Best,
Erick

On Tue, Jan 10, 2017 at 4:14 AM, Nicole Bilić  wrote:
> Hi all,
>
> We are using Suggester (and Solr 6.3.0) to implement autocomplete. We are
> using TSTLookupFactory lookup implementation and
> HighFrequencyDictionaryFactory dictionary implementation. If our index
> consists of only one shard, everything works perfectly fine. However, when
> index is split in 2 shards, we start getting duplicated suggestions. We've
> done some testing and it is peculiar that even when we have exactly the
> same documents in both shards (just for testing purposes), we get
> duplicated terms, but their weights are slightly different. We haven't
> found any solution that works (we've tried FieldCollapsing as well). Due to
> some limitations, it is not an option to solve this (ie. throw out the
> duplicates)  client-side. Is there a way to solve this issue solr-side?
>
> Thanks!


RE: How to integrate SOLR in ibm filenet 5.2.1?

2017-01-10 Thread puneetmishra2555
Hi Team

Thanks for your response, but I am not able to download ManifoldCF. Please help
me with the download so that I can check whether it will work for FileNet or
not, because none of the download links given are working.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-integrate-SOLR-in-ibm-filenet-5-2-1-tp4313090p4313328.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Same score listing order

2017-01-10 Thread Erick Erickson
Minor pedantic point (I like those).

"equiv to the order in which they were added to the index" depends on
the merge policy. That was true when Yonik wrote it, but other merge
policies added since then may or may not preserve insertion order in
terms of the internal Lucene ID. The default TieredMergePolicy, for
instance does _not_.

So TieredMergePolicy in SolrCloud can lead to "interesting"
situations. The internal lucene doc ID of two particular docs on two
replicas of the _same_ shard are not even guaranteed to be in the same
relation to each other. On replica 1 doc1 could have id 23 and doc2
could have id 50, but on replica2 they could be 66 and 19
respectively.

Which means that Jimi is absolutely correct that the only way to make
predictions about tied sorts is to specify a secondary sort. If you
wanted to, for instance, preserve the insertion order you'd add a
counter to each doc and use that counter as a secondary sort.
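
The counter idea can be sketched in a few lines of Python. This only models the ordering logic (score descending, then the counter ascending); the field name "seq" and the sample docs are hypothetical, and in Solr the equivalent would be a second clause in the sort parameter.

```python
# Each doc carries a hypothetical "seq" counter assigned at indexing time.
docs = [
    {"id": "a", "score": 1.0, "seq": 2},
    {"id": "b", "score": 1.0, "seq": 0},
    {"id": "c", "score": 2.0, "seq": 1},
]

# Primary sort: score descending. Tie-breaker: insertion counter ascending.
ranked = sorted(docs, key=lambda d: (-d["score"], d["seq"]))
print([d["id"] for d in ranked])
```

Docs "a" and "b" tie on score, so the counter alone decides their relative order, whatever internal doc IDs a given replica happens to have.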

Best,
Erick

On Tue, Jan 10, 2017 at 5:46 AM,   wrote:
> Hi Kshitij,
>
> Quoting Yonik, the creator of solr:
>
> "Ties are the same as in lucene... internal docid (equiv to the order in 
> which they were added to the index)."
>
> Also, you can have multiple sort clauses, where score can be the first one. 
> Like sort=score DESC, publishDate DESC. But I think the recommended approach 
> is to use boosting on date (etc) to affect the score instead.
>
> Hope this helps.
> /Jimi
>
> -Original Message-
> From: kshitij tyagi [mailto:kshitij.shopcl...@gmail.com]
> Sent: Tuesday, January 10, 2017 5:11 PM
> To: solr-user@lucene.apache.org
> Subject: Same score listing order
>
> Hi,
>
> I need to understand what is the order of listing the documents from query in 
> case there is same score for all documents.
>
> Regards,
> Kshitij


Division in JSON Facet

2017-01-10 Thread Zheng Lin Edwin Yeo
Hi,

I'm getting this error when I tried to do a division in JSON Facet.

  "error":{
"msg":"org.apache.solr.search.SyntaxError: Unknown aggregation
agg_div in ('div(4,2)', pos=4)",
"code":400}}


Is this division function supported in JSON Facet?

I'm using this in Solr 5.4.0

Regards,
Edwin


RE: Same score listing order

2017-01-10 Thread jimi.hullegard
Hi Kshitij,

Quoting Yonik, the creator of solr:

"Ties are the same as in lucene... internal docid (equiv to the order in which 
they were added to the index)."

Also, you can have multiple sort clauses, where score can be the first one. 
Like sort=score DESC, publishDate DESC. But I think the recommended approach is 
to use boosting on date (etc) to affect the score instead.

Hope this helps.
/Jimi

-Original Message-
From: kshitij tyagi [mailto:kshitij.shopcl...@gmail.com] 
Sent: Tuesday, January 10, 2017 5:11 PM
To: solr-user@lucene.apache.org
Subject: Same score listing order

Hi,

I need to understand what is the order of listing the documents from query in 
case there is same score for all documents.

Regards,
Kshitij


Debug logging in Maven project

2017-01-10 Thread Markus Jelsma
Hello,

I used to enable debug logging in my Maven project's unit tests by just setting 
log4j's global level to DEBUG, which is very handy, especially when debugging Solr 
Cloud start-up issues. For a while now (I'm not sure how long), I haven't been able 
to get any logging at all. This project depends on 6.3. Can anyone here 
tell me how to get something so simple but so helpful working again? 

Many thanks,
Markus

$ cat src/test/resources/log4j.properties
log4j.rootLogger=debug,info,stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%5p [%t] (%F:%L) - %m%n



Same score listing order

2017-01-10 Thread kshitij tyagi
Hi,

I need to understand the order in which documents are listed for a query
when all documents have the same score.

Regards,
Kshitij


Duplicate suggestions when having multiple shards

2017-01-10 Thread Nicole Bilić
Hi all,

We are using Suggester (and Solr 6.3.0) to implement autocomplete. We are
using TSTLookupFactory lookup implementation and
HighFrequencyDictionaryFactory dictionary implementation. If our index
consists of only one shard, everything works perfectly fine. However, when
index is split in 2 shards, we start getting duplicated suggestions. We've
done some testing and it is peculiar that even when we have exactly the
same documents in both shards (just for testing purposes), we get
duplicated terms, but their weights are slightly different. We haven't
found any solution that works (we've tried FieldCollapsing as well). Due to
some limitations, it is not an option to solve this (ie. throw out the
duplicates)  client-side. Is there a way to solve this issue solr-side?

Thanks!