Re: Stemmer and stopword Development

2015-09-10 Thread Upayavira
On Thu, Sep 10, 2015, at 04:45 AM, Imtiaz Shakil Siddique wrote: > Hi, > > I am trying to develop stemmer and stopword for Bengali language which is > not shipped with solr. > > I am trying to make this with machine learning approach but I couldn't > find > any good documents to study. It

Re: Can solr ttf functionQuery support ngram (n>2) ?

2015-09-10 Thread Jie Gao
A typo is fixed in the following query url. On 10 September 2015 at 10:25, Jie Gao wrote: > Hi, > > I'm wondering whether solr ttf functionQuery support (compound words) > ngram (n>2) ? > > I'm using " >

Re: Boosting related doubt?

2015-09-10 Thread Upayavira
That's curious. Have a look at both the parsed query, and the explains output for a very simple (even *:*) query. You should see the boost present there and be able to see whether it is applied once or twice. Upayavira On Thu, Sep 10, 2015, at 06:16 AM, Aman Tandon wrote: > Hi, > > I need to

Re: Search results differs with sorting on pagination.

2015-09-10 Thread Modassar Ather
Upayavira! I added fl=id,score,[shard] and saw the shards changing in the response every time. For different shards the response changes, but for the same shard the result is the same across multiple hits. When I add a secondary sort field, e.g. score, the shard remains the same across hits. On Thu, Sep 10, 2015

Re: Stemmer and stopword Development

2015-09-10 Thread Imtiaz Shakil Siddique
Thanks for the reply. Currently I have 20GB of Bengali newspaper data (for corpus building). I don't have a manually stemmed corpus, but if needed I will build one. Basically I need guidance regarding how to do this. If there are some standard approaches of building stemmer and stopword for use with

Re: How to reordering search result by some function query

2015-09-10 Thread Leonardo Foderaro
Hi Aman, if you want to sort/filter/boost on a custom function query please take a look at the Alba Framework, maybe it can be useful. For example, you can define a new function query (and then sort/filter/boost on it) simply by adding some annotations to your method, as explained in the wiki:

Broken highlight truncation for hl.alternateField

2015-09-10 Thread Thibaud Bioulac
Hello everybody, I have the exact same issue as Arcadius Ahouansou (see Broken highlight truncation for hl.alternateField ). To sum up : when

Re: Search results differs with sorting on pagination.

2015-09-10 Thread Upayavira
Add fl=id,score,[shard] to your query, and show us the results of two differing executions. Perhaps we will be able to see the cause of the difference. Upayavira On Thu, Sep 10, 2015, at 05:35 AM, Modassar Ather wrote: > Thanks Erick. There are no replicas on my cluster and the indexing is one

Re: Search results differs with sorting on pagination.

2015-09-10 Thread Modassar Ather
To add to my previous observation I saw the response having results from multiple shards when the secondary sort field is added and they remain same across hits. Kindly help me understand this behavior. Why the results are changing as I understand that the result should be first clubbed together

Can solr ttf functionQuery support ngram (n>2) ?

2015-09-10 Thread Jie Gao
Hi, I'm wondering whether solr ttf functionQuery support (compound words) ngram (n>2) ? I'm using " http://localhost:8983/solr/collection1/select?q=*:*=ttf(content,%22apple%20banana%22)=1" to query total term frequency of bigram tokens in "content" field in the whole index. However, the result

Re: How to reordering search result by some function query

2015-09-10 Thread Upayavira
Aman, If you are using edismax then what you have written is just fine. For Lucene query parser queries, wrap them with the boost query parser: q={!boost b=product_guideline_score v=$qq}&qq=jute Note in your example you don't need product(), just do boost=product_guideline_score Upayavira On
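A minimal sketch of how the boost-parser request described above could be assembled from a client. It only builds the query string; the function name boost_query_params is hypothetical, and the field name and the qq parameter are taken from the message (v=$qq dereferences a separate qq request parameter that carries the user's query).

```python
from urllib.parse import urlencode, parse_qs


def boost_query_params(user_query, boost_field):
    """Wrap a Lucene-parser query in the boost query parser via local params.

    The main q parameter holds only the {!boost ...} wrapper; the actual
    user query travels in the separate qq parameter that v=$qq points at.
    """
    return {
        "q": "{!boost b=%s v=$qq}" % boost_field,
        "qq": user_query,
    }


params = boost_query_params("jute", "product_guideline_score")
query_string = urlencode(params)  # URL-encodes braces, spaces and the $ sign
print(query_string)
```

Keeping the user query in its own parameter means it never has to be escaped into the local-params string.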

Re: Stemmer and stopword Development

2015-09-10 Thread Upayavira
I haven't heard of any machine learning based stemmers. I'm not really sure what algorithm you would use to do stemming - what you'd be looking for is something that says, well, running stemmed to run, walking stemmed to walk, therefore hopping should stem to hop, but that'd be quite an algorithm

Re: Search results differs with sorting on pagination.

2015-09-10 Thread Modassar Ather
If two documents come back from different shards with the same score, the order would not be predictable This is fine. What I am not able to understand is that when I do not give a secondary field for sort I am getting the result from one shard which changes to other shard in other hits. Here

Re: Search results differs with sorting on pagination.

2015-09-10 Thread Upayavira
What scores are you getting? If two documents come back from different shards with the same score, the order would not be predictable - probably down to which shard responds first. Fix it with something like sort=score,timestamp or some other time related field. Upayavira On Thu, Sep 10, 2015,
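The tie-breaking point above can be illustrated with a toy model. This is not Solr's merge code, only a sketch under the assumption that each shard returns a list of hits with score, timestamp and id; the function name merge_shard_results and the sample documents are hypothetical. In a real request the sort parameter also needs a direction per field, for example sort=score desc,timestamp desc.

```python
def merge_shard_results(shard_responses):
    """Merge per-shard hits by score (desc), then timestamp (desc), then id.

    With the extra sort keys, the merged order no longer depends on the
    order in which the shards happened to respond.
    """
    merged = [doc for hits in shard_responses for doc in hits]
    merged.sort(key=lambda d: (-d["score"], -d["timestamp"], d["id"]))
    return merged


# Two documents with identical scores coming from different shards.
shard_a = [{"id": "a1", "score": 1.0, "timestamp": 100}]
shard_b = [{"id": "b1", "score": 1.0, "timestamp": 200}]

first = merge_shard_results([shard_a, shard_b])   # shard A answered first
second = merge_shard_results([shard_b, shard_a])  # shard B answered first
print([d["id"] for d in first], [d["id"] for d in second])
```

Sorting on score alone would leave the two tied documents in arrival order, which is the unpredictability described in the message.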

Re: Detect term occurrences

2015-09-10 Thread Walter Underwood
Doing a query for each term should work well. Solr is fast for queries. Write a script. I assume you only need to do this once. Running all the queries will probably take less time than figuring out a different approach. wunder Walter Underwood wun...@wunderwood.org
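A rough sketch of the one-query-per-term script suggested above. The Solr URL, the collection name and the field name content are placeholders, not details from this thread; count_hits needs a running Solr instance, so only the URL builder is exercised here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder endpoint; adjust host, port and collection for a real setup.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def term_query_url(term, field="content"):
    """Build a select URL that counts documents containing one term."""
    # Quote the term so multi-word thesaurus entries become phrase queries.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": '%s:"%s"' % (field, escaped), "rows": 0, "wt": "json"}
    return SOLR_SELECT + "?" + urlencode(params)


def count_hits(term, field="content"):
    """Return numFound for one term (requires a reachable Solr server)."""
    with urlopen(term_query_url(term, field)) as resp:
        return json.load(resp)["response"]["numFound"]


print(term_query_url("aspirin"))
```

Setting rows=0 keeps each response small, since only the hit count is needed.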

RE: Detect term occurrences

2015-09-10 Thread Markus Jelsma
If you are interested in just the number of occurrences of an indexed term, the TermsComponent will give that answer. Markus -Original message- > From:Francisco Andrés Fernández > Sent: Thursday 10th September 2015 15:58 > To: solr-user@lucene.apache.org > Subject:
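A small sketch of what a TermsComponent request might look like. It assumes a /terms request handler is configured (as in the stock example configs) and uses the terms.fl, terms.prefix and terms.limit parameters; the base URL and field name are placeholders. As far as I know, the counts it returns are document frequencies, so they say how many documents contain a term rather than how often it occurs in total.

```python
from urllib.parse import urlencode


def terms_request_url(base_url, field, prefix, limit=10):
    """Build a TermsComponent URL listing indexed terms with a given prefix."""
    params = {
        "terms": "true",
        "terms.fl": field,       # field whose indexed terms are listed
        "terms.prefix": prefix,  # only terms starting with this prefix
        "terms.limit": limit,    # maximum number of terms returned
        "wt": "json",
    }
    return base_url + "/terms?" + urlencode(params)


url = terms_request_url("http://localhost:8983/solr/collection1", "content", "aspir")
print(url)
```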

Re: Can solr ttf functionQuery support ngram (n>2) ?

2015-09-10 Thread Jie Gao
Please ignore this question. There is no problem with the ttf functionQuery now. I made a mistake when manually checking the tf result. The row size should be set to more than the default size (10) for phrase query " http://localhost:8983/solr/

Debugging Angular JS Application

2015-09-10 Thread Esther-Melaine Quansah
Hi, Is there a way for me to debug and modify Angular JS code in the Solr Admin UI without needing to completely rebuild the server and clearing browser cache? Thanks, Esther

Re: Detect term occurrences

2015-09-10 Thread Alexandre Rafalovitch
Can you tell us a bit more about the business case? Not the current technical one. Because it is entirely possible Solr can solve the higher level problem out of the box without you doing manual term comparisons. In which case, your problem scope is not quite right. Regards, Alex. Solr

Re: ghostly config issues

2015-09-10 Thread Mark Fenbers
On 9/7/2015 4:52 PM, Shawn Heisey wrote: The only files that should be in server/lib is jetty and servlet jars. The only files that should be in server/lib/ext is logging jars (slf4j, log4j, etc). In the server/lib directory on Solr 5.3.0: ext/ javax.servlet-api-3.1.0.jar

Detect term occurrences

2015-09-10 Thread Francisco Andrés Fernández
Hi all, I'm new to Solr. I want to detect all occurrences of terms existing in a thesaurus in 1 or more documents. What's the best strategy to do it? Doing a query for each term doesn't seem to be the best way. Many thanks, Francisco

Re: Search results differs with sorting on pagination.

2015-09-10 Thread Erick Erickson
First, if Upayavira's intuition is correct (and I'm guessing it is), then the behavior you're seeing is probably an accident of coding rather than intentional. I think the algorithm is something like this: Node1 gets the original query Node1 sends sub-queries out to each shard. As the results

Re: How to reordering search result by some function query

2015-09-10 Thread Aman Tandon
> > boost=product_guideline_score Thank you Upayavira. Leonardo, thanks for the suggestion. But I think boost parameter will work great for us. Thank you so much for your help. With Regards Aman Tandon On Thu, Sep 10, 2015 at 5:11 PM, Upayavira wrote: > Aman, > > If you

Re: Stemmer and stopword Development

2015-09-10 Thread Imtiaz Shakil Siddique
Hi Upayavira, Thank you for your kind assistance Sir. If that is the requirement for stemming then I will do it. My next question is how can I build a stopword list for Bengali language? The options that I've thought about are 1. Calculate idf values for all the stemmed words inside 20GB crawled
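The idf idea in the first option can be sketched as follows. This is only an illustration under assumptions: whitespace tokenization stands in for real Bengali tokenization and stemming, the threshold max_idf is arbitrary, and the function name stopword_candidates and the toy documents are hypothetical.

```python
import math
from collections import Counter


def stopword_candidates(documents, max_idf):
    """Return terms whose idf is at or below max_idf, i.e. very common terms.

    idf is computed as log(N / df), where df is the number of documents
    that contain the term at least once.
    """
    n = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(doc.split()))  # count each term once per document
    idf = {term: math.log(n / count) for term, count in df.items()}
    return sorted(term for term, value in idf.items() if value <= max_idf)


docs = ["the cat sat", "the dog ran", "the bird flew", "a cat ran"]
print(stopword_candidates(docs, max_idf=0.3))
```

Terms that appear in nearly every document get an idf close to zero, which is why a low threshold surfaces likely stopwords; the resulting list would still need manual review.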

Re: Debugging Angular JS Application

2015-09-10 Thread Erik Hatcher
With the exploded structure, maybe we can move the webapp source underneath server/solr-webapp (and let the build just fill in the binary Java stuff, and avoid overwriting anything). Then we can keep the source in the same place as the “dist”, keeping it nice and DRY and easily

Re: error while running query on solr slave

2015-09-10 Thread shahper
Sorry for late reply. I am facing one more issue now. 1. When I shut down my master and start working with my slave, I am not able to fetch any data. As far as I can check, the data folder in my core is the same as the master's, but even then I am not able to get any data when I run any query. "error":{

Re: Debugging Angular JS Application

2015-09-10 Thread Shawn Heisey
On 9/10/2015 9:03 AM, Esther-Melaine Quansah wrote: > Is there a way for me to debug and modify Angular JS code in the Solr Admin > UI without needing to completely rebuild the server and clearing browser > cache? I'm not sure about browser caching. That might be a problem, but if it is, it's

Re: How to secure Admin UI with Basic Auth in Solr 5.3.x

2015-09-10 Thread Imtiaz Shakil Siddique
If you are using a Linux server you can always use iptables to restrict access to the solr admin panel. On Sep 9, 2015 3:05 PM, "Merlin Morgenstern" wrote: > I just installed solr cloud 5.3.x and found that the way to secure the admin > ui has changed. Apparently there is a new

Re: Debugging Angular JS Application

2015-09-10 Thread Upayavira
That would be fantastic, Erik. I've got a somewhat complex setup where I rsync between folders. Being able to serve directly from the SVN location would be very handy. Upayavira On Thu, Sep 10, 2015, at 04:58 PM, Erik Hatcher wrote: > With the exploded structure, maybe we can move the webapp

Re: Debugging Angular JS Application

2015-09-10 Thread Upayavira
On Thu, Sep 10, 2015, at 04:03 PM, Esther-Melaine Quansah wrote: > Hi, > > Is there a way for me to debug and modify Angular JS code in the Solr > Admin UI without needing to completely rebuild the server and clearing > browser cache? I just edit the files in server/solr-webapp/webapp, and

Re: How to secure Admin UI with Basic Auth in Solr 5.3.x

2015-09-10 Thread Noble Paul
Check this https://cwiki.apache.org/confluence/display/solr/Securing+Solr There are a couple of bugs in 5.3.0 and a bug fix release is coming up over the next few days. We don't provide any specific means to restrict access to admin UI itself. However we let users specify fine grained ACLs on

Issue while adding Long.MAX_VALUE to a TrieLong field

2015-09-10 Thread Pushkar Raste
Hi, I am trying to add the following document (value for price.long is Long.MAX_VALUE) 411 one 9223372036854775807 However, upon querying my collection, the value I get back for "price.long" is 9223372036854776000 Definition for 'price.long' field and 'long' look like

Re: Using join with edismax

2015-09-10 Thread Upayavira
On Thu, Sep 10, 2015, at 10:51 PM, Steven White wrote: > Hi everyone, > > Does any one know if "join" across cores supported with edismax? Why wouldn't it be? To unpack the question more though, edismax is a query parser, join is a query parser. You can certainly have an edismax query in the

Re: Issue while adding Long.MAX_VALUE to a TrieLong field

2015-09-10 Thread Pushkar Raste
Thank you Yonik, looks like I missed previous reply. This seems logical as the maximum safe integer in JavaScript is (2^53 - 1), which is the max value I can insert and validate through Admin UI. Never thought Admin UI itself would trick me though. On Thu, Sep 10, 2015 at 6:01 PM, Yonik Seeley
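The precision limit mentioned above can be checked outside a browser. JavaScript numbers are IEEE 754 doubles, and Python's float is the same format, so this sketch uses a float round trip as a stand-in for what happens when the Admin UI parses the JSON response. The helper name survives_double is hypothetical.

```python
MAX_SAFE_INTEGER = 2**53 - 1  # largest integer a double represents exactly
LONG_MAX = 2**63 - 1          # Java's Long.MAX_VALUE, 9223372036854775807


def survives_double(n):
    """Return True if n is unchanged by a round trip through a 64-bit float."""
    return int(float(n)) == n


print(survives_double(MAX_SAFE_INTEGER))  # True
print(survives_double(LONG_MAX))          # False
print(int(float(LONG_MAX)))               # 9223372036854775808
```

The value stored in the index is unaffected; only a client that parses the number as a double shows the rounded result.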

Issue while adding Long.MAX_VALUE to a TrieLong field

2015-09-10 Thread Pushkar Raste
I am trying to add the following document (value for price.long is Long.MAX_VALUE) 411 one 9223372036854775807 However, upon querying my collection, the value I get back for "price.long" is 9223372036854776000 (I got the same behavior when I used a JSON file) Definition for

Re: Debugging Angular JS Application

2015-09-10 Thread Erik Hatcher
Upayavira, could you give this a try and see if this works (patch is for trunk): https://issues.apache.org/jira/browse/SOLR-8035 And when do we make the Angular UI the default? :) Erik > On Sep 10, 2015, at 12:26 PM, Upayavira

Using join with edismax

2015-09-10 Thread Steven White
Hi everyone, Does anyone know if "join" across cores is supported with edismax? Thanks!!! Steve,

Re: Issue while adding Long.MAX_VALUE to a TrieLong field

2015-09-10 Thread Yonik Seeley
On Thu, Sep 10, 2015 at 5:43 PM, Pushkar Raste wrote: Did you see my previous response to you today? http://markmail.org/message/wt6db4ocqmty5a42 Try querying a different way, like from the command line using curl, or from your browser, but not through the solr admin.

Re: Debugging Angular JS Application

2015-09-10 Thread Upayavira
On Thu, Sep 10, 2015, at 10:52 PM, Erik Hatcher wrote: > Upayavira, could you give this a try and see if this works (patch is for > trunk): https://issues.apache.org/jira/browse/SOLR-8035 > Will look :-) > And when do we make the Angular UI

Re: Detect term occurrences

2015-09-10 Thread Erick Erickson
_Assuming_ this isn't a high throughput _and_ the leaflet text isn't too big... Index the thesaurus and fire all the terms of the query in a big OR clause against the index as a _query_. Perhaps turn highlighting on and highlight the entire leaflet text. Note, this is just "off the top of my
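A sketch of building the big OR clause mentioned above from a list of terms. The function name or_clauses and the field name text are hypothetical, and the batching is my own addition: Solr's maxBooleanClauses setting defaults to 1024, so a very long term list may need to be split into several queries or the limit raised.

```python
def or_clauses(terms, field, batch_size=1000):
    """Yield OR queries with at most batch_size quoted clauses each."""
    for start in range(0, len(terms), batch_size):
        batch = terms[start:start + batch_size]
        clauses = []
        for term in batch:
            # Escape backslashes and quotes, then quote the term as a phrase.
            escaped = term.replace("\\", "\\\\").replace('"', '\\"')
            clauses.append('%s:"%s"' % (field, escaped))
        yield " OR ".join(clauses)


queries = list(or_clauses(["aspirin", "beta blocker", "statin"], "text", batch_size=2))
print(queries)
```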

Re: Issue Using Solr 5.3 Authentication and Authorization Plugins

2015-09-10 Thread Dan Davis
Kevin & Noble, I've manually verified the fix for SOLR-8000, but not yet for SOLR-8004. I reproduced the initial problem with reloading security.json after restarting both Solr and ZooKeeper. I verified using zkcli.sh that ZooKeeper does retain the changes to the file after using

Re: Search results differs with sorting on pagination.

2015-09-10 Thread Modassar Ather
Thanks Erick and Upayavira for the responses. One thing which I noticed in context of single sort field that the scores differ in each shard response. No score is identical in the response of one shard and they differ too in the responses from other shards. The score I got using fl=score.

Re: Detect term occurrences

2015-09-10 Thread Francisco Andrés Fernández
Yes. I have many drug product leaflets, each corresponding to 1 product. On the other hand we have a medical dictionary with about 10^5 terms. I want to detect all the occurrences of those terms for any leaflet document. Could you give me a clue about the best way to do it? Perhaps,

Re: Issue Using Solr 5.3 Authentication and Authorization Plugins

2015-09-10 Thread Dan Davis
SOLR-8004 also appears to work to me. I manually edited security.json and did putfile. I didn't bother with browse permission, because it was Kevin's workaround. solr-5.3.1-SNAPSHOT did challenge me for credentials when going to curl

Loading Solr Analyzer from RuntimeLib Blob

2015-09-10 Thread Steve Davids
Accidentally sent this on the java-users list instead of solr-users... Hi, I am attempting to migrate our deployment process over to using the recently added "Blob Store API" which should simplify things a bit when it comes to cloud infrastructures for us. Unfortunately, after loading the jar

Re: Issue while adding Long.MAX_VALUE to a TrieLong field

2015-09-10 Thread Yonik Seeley
On Thu, Sep 10, 2015 at 2:21 PM, Pushkar Raste wrote: > Hi, > I am trying to add the following document (value for price.long is > Long.MAX_VALUE) > > > 411 > one > 9223372036854775807 > > > However upon querying my collection value I get back

IOException occured when talking to server

2015-09-10 Thread ku3ia
Hi all! Sometimes this ERROR appears in the logs: ERROR - 2015-09-10 11:52:19.940; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://x.x.x.x:8080/solr/corename_shard1_replica1

Re: Stemmer and stopword Development

2015-09-10 Thread Upayavira
It depends on why you want stopwords. Stopwords were an important thing back in the day - they helped performance. Now, with a decent CPU and TF/IDF on your side, they don't do so much harm, in fact, avoiding them can save the day: q=to be or not to be would not locate anything if we'd used
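The "to be or not to be" case above is easy to reproduce in miniature. This is a toy stand-in for an analysis chain with a stop filter, not Solr's implementation; the stopword set and the function name remove_stopwords are illustrative assumptions.

```python
# Illustrative stopword list; a real list would be much longer.
STOPWORDS = {"to", "be", "or", "not"}


def remove_stopwords(query):
    """Lowercase, split on whitespace and drop stopwords."""
    return [token for token in query.lower().split() if token not in STOPWORDS]


print(remove_stopwords("to be or not to be"))  # []
print(remove_stopwords("Hamlet to be"))        # ['hamlet']
```

When every token is a stopword, nothing is left to match against the index, which is how a query made only of common words ends up finding nothing.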

Re: Stemmer and stopword Development

2015-09-10 Thread Doug Turnbull
I've used stopwords to reduce the index size considerably to improve search performance (same with stemming, etc). For relevance I've often preferred to leave stop words in for the reasons Upayavira mentions. There's all kinds of confusing things that can happen with stopwords that sometimes

Re: Boosting related doubt?

2015-09-10 Thread Shawn Heisey
On 9/9/2015 11:16 PM, Aman Tandon wrote: > I need to ask that when I am looking for all the parameters of the > query using the *echoParams=ALL*, I am getting the boost parameter twice in > the information printed on the browser screen. If you see a parameter twice in the "echoParams=all"

RE: Stemmer and stopword Development

2015-09-10 Thread Davis, Daniel (NIH/NLM) [C]
Stop words for international indexing seem not too useful to me at this point. To use them, you definitely have to know what language you are in at all times, and that doesn't happen with unstructured data (e.g. a bunch of PDF/Word files that happen to be linked from a bunch of web pages).

Re: How to reordering search result by some function query

2015-09-10 Thread Aman Tandon
Hi, I figured it out to implement the same. I will be doing this by using the boost parameter e.g. http://server:8112/solr/products/select?q=jute=title *=product(1,product_guideline_score)* If there is any other alternative then please suggest. With Regards Aman Tandon On Thu, Sep 10, 2015 at