Multi word query - Highlight should show all matched words instead of single word

2015-04-30 Thread Ap
I am using a multi-word search query to find matches for the text that the 
user is typing. (I'm not using Suggesters.)

eg. If a doc has a field with the text "... ... ... .. The Moon is a 
part... ..  moon is abc ... .. universe is xyz... .. . galaxy 
in the universe ."
 The user types "*moo* *univers*". The response I need to send in 
suggestions should be *"moon universe"* and not "moon", "universe" separately, 
which is what happens right now because I am creating a map of tokens.

Currently I am using Highlighting, and it returns the matching snippets 
below:

1. The *Moon* is a part
2. *moon* is abc
3. *universe* is xyz
4. galaxy in the *universe*
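
One way to get a single suggestion per query, rather than one per token, is to post-process the highlight fragments on the client. The following is a minimal sketch, not an Elasticsearch feature: `combine_highlights` is a hypothetical helper, and it assumes the highlighted term is wrapped in a known pair of markers (asterisks as shown in the snippets above; the highlighter's default is `em` tags, configurable through `pre_tags` and `post_tags`).

```python
import re

def combine_highlights(typed, fragments, pre="*", post="*"):
    """Build one suggestion: for each typed prefix, pick the first
    highlighted term that starts with it (case-insensitive)."""
    marker = re.compile(re.escape(pre) + r"(.+?)" + re.escape(post))
    # Collect highlighted terms in order of appearance, lowercased, no duplicates.
    terms = []
    for fragment in fragments:
        for term in marker.findall(fragment):
            term = term.lower()
            if term not in terms:
                terms.append(term)
    # Keep the order in which the user typed the prefixes.
    picked = []
    for prefix in typed.lower().split():
        match = next((t for t in terms if t.startswith(prefix)), None)
        if match is not None:
            picked.append(match)
    return " ".join(picked)

fragments = [
    "The *Moon* is a part",
    "*moon* is abc",
    "*universe* is xyz",
    "galaxy in the *universe*",
]
print(combine_highlights("moo univers", fragments))  # moon universe
```

Prefixes that have no highlighted term are simply skipped, so a partly matching query still yields a shorter suggestion.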


Thanks
 

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ffc2f74c-e1d2-457f-a530-5886c27e5abc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Completion Suggester -Search Analyzer Tokens not getting Union of Results

2015-04-13 Thread Ap
I have the following search_analyzer:

settings -->
"whitespace_analyzer": {
    "type": "custom",
    "tokenizer": "whitespace",
    "filter": [
        "lowercase",
        "asciifolding"
    ]
}
mapping -->
"index_analyzer" : "whitespace_analyzer",
"search_analyzer" : "whitespace_analyzer",


I have inserted a single doc, "California World".
The search results are as follows:

1. "Cali" --> matches correctly and returns the doc
2. "Wor" --> matches correctly and returns the doc
*3. "Cali Wor" --> does NOT match, and the doc is not returned. --> 
This is the problem. This text should return this doc.*

Does anyone know what's wrong?
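
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the completion suggester is a prefix suggester, so the typed text has to be a prefix of a whole input, and "Cali Wor" is not a prefix of "California World". The sketch below contrasts that behaviour with per-word prefix matching and builds a `match` query that requires every typed word. The field name `name.autocomplete` is hypothetical; it stands for a regular field indexed with an edge n-gram analyzer and searched with the whitespace analyzer.

```python
def whole_prefix_match(text, typed):
    """Roughly what a prefix suggester does: typed text must start the input."""
    return text.lower().startswith(typed.lower())

def all_prefixes_match(text, typed):
    """True if every typed word is a prefix of some word in the text."""
    tokens = text.lower().split()
    return all(any(tok.startswith(p) for tok in tokens)
               for p in typed.lower().split())

def autocomplete_query(typed, field="name.autocomplete"):
    """Search body requiring all typed words (field name is an assumption)."""
    return {"query": {"match": {field: {"query": typed, "operator": "and"}}}}

doc = "California World"
for typed in ("Cali", "Cali Wor"):
    print(typed, whole_prefix_match(doc, typed), all_prefixes_match(doc, typed))
print(autocomplete_query("Cali Wor"))
```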

Thanks



Completion suggester get Exact matched token

2015-04-13 Thread Ap
How do you get the exact matched token from the completion suggester ?

eg.
  {
"id" : "ID1",
"name" : "AAA BBB  DDD EEE  GGG H   KKK LLL 999 
888",
"suggest" : {
"input": 
["AAA","BBB","","DDD","EEE","US","California","Framework","","","KKK","LLL","999","888"],
"output": **
"payload" : {
"id" : "ID1",
"name" : text
}
   }

So if a user types cali, the output that needs to be returned to the user 
is the matched token "California" instead of the complete string or id.
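
If the suggester itself cannot return the matched token, one fallback is to recover it on the client from the `input` list stored with the document. This is only a sketch of that idea; `matched_inputs` is a hypothetical helper and the list below is copied from the example document above.

```python
def matched_inputs(inputs, typed):
    """Return the inputs that start with the typed text (case-insensitive),
    ignoring empty entries."""
    typed = typed.lower()
    return [item for item in inputs if item and item.lower().startswith(typed)]

inputs = ["AAA", "BBB", "", "DDD", "EEE", "US", "California", "Framework",
          "", "", "KKK", "LLL", "999", "888"]
print(matched_inputs(inputs, "cali"))  # ['California']
```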

Thanks



Completion suggester - Finite Strings Error

2015-04-08 Thread Ap
I get the following error when I try to insert a doc

*TransportError(500, u'IllegalArgumentException[TokenStream expanded to 
44800 finite strings. Only <= 256 finite strings are supported]')*


*Index: This gets created successfully*

index_body = {
    "settings": {
        "analysis": {
            "filter": {
                "nGram_filter": {
                    "type": "edge_ngram",
                    "min_gram": 2,
                    "max_gram": 20,
                    "token_chars": [
                        "letter",
                        "digit",
                        "punctuation",
                        "symbol"
                    ]
                }
            },
            "analyzer": {
                "nGram_analyzer": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filter": [
                        "lowercase",
                        "asciifolding",
                        "nGram_filter"
                    ]
                },
                "whitespace_analyzer": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filter": [
                        "lowercase",
                        "asciifolding"
                    ]
                },
                "folding": {
                    "tokenizer": "standard",
                    "filter":  [
                        "lowercase",
                        "asciifolding"
                    ]
                }
            }
        }
    },
    "mappings": {
        "funds" : {
            "properties": {
                "name": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "suggest" : {
                    "type" : "completion",
                    "index_analyzer" : "nGram_analyzer",
                    "search_analyzer" : "whitespace_analyzer",
                    "payloads" : "true"
                }
            }
        }
    }
}

*Doc to insert: *

*This gets inserted fine *
json3 = {
    "name" : "ABC 123144",
    "suggest" : {
        "input": ["ABC 123144"],
        "output": "ABC 123144",
        "payload" : {
            "id" : "ID123"
        }
    }
}

*This doc gives the above error*
json2 = {
    "name" : "ABC 123144 ASDASD ASFSDFADF GROUP ADSAD ADAFAFAF",
    "suggest" : {
        "input": ["ABC 123144 ASDASD ASFSDFADF GROUP ADSAD ADAFAFAF"],
        "output": "ABC 123144 ASDASD ASFSDFADF GROUP ADSAD ADAFAFAF",
        "payload" : {
            "id" : "ID123"
        }
    }
}

What are the possible solutions or alternatives for this error ?
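
The number in the error message can be reproduced by counting edge n-grams. The sketch below assumes that every gram produced by `nGram_filter` sits at the same position as its source token, so the completion field has to enumerate one string per combination of grams; the helper names are hypothetical.

```python
def edge_ngram_count(token, min_gram=2, max_gram=20):
    """Number of edge n-grams emitted for one token."""
    return max(0, min(len(token), max_gram) - min_gram + 1)

def finite_strings(text, min_gram=2, max_gram=20):
    """Product of the per-token gram counts (whitespace tokenizer assumed)."""
    total = 1
    for token in text.split():
        total *= edge_ngram_count(token, min_gram, max_gram)
    return total

print(finite_strings("ABC 123144"))  # 10
print(finite_strings("ABC 123144 ASDASD ASFSDFADF GROUP ADSAD ADAFAFAF"))  # 44800
```

The short name yields 2 * 5 = 10 strings, under the limit of 256, while the long one yields 2 * 5 * 5 * 8 * 4 * 4 * 7 = 44800. If this reading is right, two options worth trying are to drop `nGram_filter` from `index_analyzer` (the completion suggester already matches prefixes) or to split long names into several shorter inputs.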

Thanks



Suggester -Fields inside Payload vs regular fields

2015-04-06 Thread Ap
I need to create a Search Suggester on text and return an object with a bunch 
of fields.

The object contains the following fields: 
userId, name, x,y,z,a,b,c

*Search needs to be done only on 2 fields - Name & Text.*

I also need to return the Object(with fields) if the docId is explicitly 
requested.
curl -XGET 'http://localhost:9200/twitter/tweet/1'


*Question*: Should everything in the Object (that needs to be returned) go 
in the payload only *(Option 1)* or in both the payload and regular fields *(Option 2)*?

*Option 1:*
json = {
    "suggest" : {
        "input": ["US"],
        "output": "US",
        "payload" : {
            "userId" : "OID123", "name" : "US", "text" : "California", ... < list of fields>
        }
    }
}

*Option 2:*
json = {
    "userId" : "OID123", "name" : "US", "text" : "California", ... < list of fields>
    "suggest" : {
        "input": ["abc"],
        "output": "abc",
        "payload" : {
            "userId" : "OID123", "name" : "US", "text" : "California", ... < list of fields>
        }
    }
}



Completion Suggester - Access payload from another Index

2015-04-06 Thread Ap
I have a few indices A, B, C, each having userId, name, etc.

I want to search text in A and if a hit is found, the payload should return 
the following:

1. userId, name of that doc in Index A
2. documentID/s of Index B,C

This behavior extends to the search in Index B and C as well.

*Step 1. How do I store this in the payload for each Index?*

*The structure for the payload looks like this, e.g. for A:*
"payload" : {
    "userId" : "A1",
    "name" : "a1",
    "documentIDForB" : ,
    "listDocumentIDForC" : []
}

*Step 2. How do I retrieve the payload information directly from Index B, C? 
Is this treated like any other regular field when retrieving info from a 
doc in ES?*

Thanks




Search Bar queries/functionalities

2015-02-20 Thread Ap
I am building a new search for the following components:

1. Names of various entities ( objects, people, things)
2. Documents belonging to those entities
3. Connections of those entities to other entities

What queries/functionalities would be useful to add to the search bar? 
How can they be executed?

Some I can think of are :

1. Text match
2. Phrase match
3. Auto completion --> How to implement this ? (Indexing,querying, etc)
4. Text/Phrase correction --> How to implement this ? 
5. etc ( Suggestions welcome)

Thanks.



Re: Elastic Search - Sentiment analysis

2015-02-11 Thread Ap
Hi Jürgen,

Thanks for the reply. Can you tell me how the analysis can be done to 
identify that a document is a relevant financial document, say based on terms or 
phrases?

On Wednesday, February 11, 2015 at 2:03:20 AM UTC-8, Ap wrote:
>
> How do I identify relevant financial documents based on the 
> terms/phrases/sentiments present inside the Document ?  
>
> eg. Relevant document might contain--> Wall Street hits a new High.
>   Irrelevant document might contain --> I was walking on Wall Street 
> and met an old friend.
>



Elastic Search - Sentiment analysis

2015-02-11 Thread Ap
How do I identify relevant financial documents based on the 
terms/phrases/sentiments present inside the Document ?  

eg. Relevant document might contain--> Wall Street hits a new High.
  Irrelevant document might contain --> I was walking on Wall Street and 
met an old friend.




Update by Query - Python/Java client

2015-02-09 Thread Ap
I need to update tags in docs based on a query search. I am able to do it 
using the update-by-query plugin.

Is there a way to do it using the Python or Java API client?

Here is the query that works perfectly fine. It searches for a phrase and updates 
a tag.

curl -XPOST 'localhost:9200/index/type/_update_by_query' -d '{
"query" : { "match_phrase" : { "content" : "michael aronstein" } },
"script" : "ctx._source.tags += tag",
"params" : {
"tag" : "ABC"
}
}'
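
For the Python side, one option is to send the same request through the client's low-level transport, since the plugin endpoint has no dedicated client method that I am aware of. The sketch below only builds the request so it can run without a cluster; `update_by_query_request` is a hypothetical helper, and the commented-out call assumes an `es` client object from elasticsearch-py.

```python
import json

def update_by_query_request(index, doc_type, phrase, tag, field="content"):
    """Return (method, path, body) mirroring the curl call above."""
    path = "/%s/%s/_update_by_query" % (index, doc_type)
    body = {
        "query": {"match_phrase": {field: phrase}},
        "script": "ctx._source.tags += tag",
        "params": {"tag": tag},
    }
    return "POST", path, body

method, path, body = update_by_query_request(
    "index", "type", "michael aronstein", "ABC")
print(method, path)
print(json.dumps(body, indent=2))

# Assumption: with an elasticsearch-py client `es`, the request could be sent as
#   es.transport.perform_request(method, path, body=body)
```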


Amay



Re: Using Python/Java code for preprocessing documents

2015-02-05 Thread Ap
Hi Amit,

I am trying to do something similar. Did you find a solution to run the 
sentiment analysis at insertion?

I posted my question here a few minutes ago:
https://groups.google.com/forum/#!topic/elasticsearch/Nx3YlftE1sI

Amay

On Saturday, December 7, 2013 at 11:01:16 AM UTC-8, Amit Gupta wrote:
>
> It seems like a pretty easy question, but for some reason I still can't 
> understand how to solve the same. I have an elastic search cluster which is 
> using twitter river to download tweets. I would like to implement a 
> sentiment analysis module which takes each tweet and computes a score 
> (+ve/-ve) etc. I would like the score to be computed for each of the 
> existing tweets as well as for new tweets and then visualize using Kibana.
>
> However, I am not sure where should I place the call to this sentiment 
> analysis module in the elastic search pipeline.
>
> I have considered the option of modifying twitter river plugin but that 
> will not work retrospectively.
>
> Essentially, I need to answer two questions :- 1) how to call python/java 
> code while indexing a document so that I can modify the json accordingly. 
> 2) how to use the same code to modify all the existing documents in ES.
>



Creating Tags at Insertion in ES

2015-02-05 Thread Ap
I am trying to create a tagging mechanism in Elasticsearch that checks for 
a given text in the document and, if the text is found, tags the document 
with the corresponding text_id.

Eg. Tagging for Company names, text = "Google Inc", text_id = "GOOG_123"

If a document has the text "Google Inc", then tag it with the id = 
"GOOG_123".

A. Insertion ?
 - Is there a way  in Elastic Search to achieve this at the time of 
Insertion ? 
 - If not, is there a way to tweak the Java code in Lucene to achieve 
this ?

B. Post processing
- If it is not possible to do at insertion, how can we better achieve 
this with post processing ? Running Fuzzy or Term search ?
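
One simple option for the insertion case is to tag the document in the client, just before it is sent to Elasticsearch. The sketch below assumes the text lives in a field called `content` and that the text-to-id lookup fits in memory; `COMPANY_TAGS` and `tag_document` are hypothetical names, and matching is a plain case-insensitive substring test.

```python
# Hypothetical lookup table: text to search for -> text_id
COMPANY_TAGS = {"Google Inc": "GOOG_123"}

def tag_document(doc, lookup=COMPANY_TAGS, field="content"):
    """Return a copy of doc with a 'tags' list holding the ids of every
    lookup text that occurs in doc[field]."""
    text = doc.get(field, "").lower()
    tags = sorted(text_id for phrase, text_id in lookup.items()
                  if phrase.lower() in text)
    tagged = dict(doc)
    tagged["tags"] = tags
    return tagged

print(tag_document({"content": "Shares of Google Inc rose today."}))
print(tag_document({"content": "Nothing relevant here."}))
```

For large lookup tables, the percolator (registering one query per text and percolating each document before indexing it) may be a better fit than a substring scan.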

Thanks,
Amay



ES Connection object - Lifetime

2014-11-24 Thread Ap

How long does the connection object stay open in ES ?


Eg. For Python

import elasticsearch

es = elasticsearch.Elasticsearch([
    {'host': 'localhost', 'port': 9200}
])

I have a web service that keeps running in production. Do I need to check 
whether the connection is still open after some time, say 24 hrs? 
How can I keep it open all the time?
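
The Python client is generally meant to be created once and reused: it manages a pool of HTTP connections internally rather than one long-lived socket (this is general knowledge about the client, not something confirmed in this thread). If an explicit check is still wanted, a minimal sketch using the client's `ping()` method could look like the following. `ensure_client` and `make_client` are hypothetical names, and `FakeClient` only stands in for `elasticsearch.Elasticsearch` so the example runs without a cluster.

```python
def ensure_client(client, make_client):
    """Return the client if it answers a ping, otherwise build a fresh one."""
    if client is not None and client.ping():
        return client
    return make_client()

class FakeClient(object):
    """Stand-in for elasticsearch.Elasticsearch (illustration only)."""
    def __init__(self, alive=True):
        self.alive = alive

    def ping(self):
        return self.alive

stale = FakeClient(alive=False)
fresh = ensure_client(stale, FakeClient)
print(fresh is stale, fresh.ping())  # False True
```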


Thanks.



What is wrong with the below Filtered Query ?

2014-11-19 Thread Ap
I am using a filter to first restrict results to source = xyz and then score 
documents based on name = abc:

curl -XGET 'http://:9200//_search?pretty' -d  '{ 
"filtered" : { 
  "query": { "term": { "name : "abc" }}, 
  "filter": { "terms" : { "source" : ["xyz"] }}
}}'
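
Two things stand out in the request above: the closing quote after `name` is missing, and `filtered` is a query type, so it needs to sit under a top-level `query` key. A sketch of the corrected body, built in Python so the JSON is guaranteed to be well-formed (`filtered_search_body` is a hypothetical helper):

```python
import json

def filtered_search_body(name, sources):
    """Search body: filter on source first, then score on name."""
    return {
        "query": {
            "filtered": {
                "query": {"term": {"name": name}},
                "filter": {"terms": {"source": sources}},
            }
        }
    }

body = filtered_search_body("abc", ["xyz"])
# The printed JSON can be used as the -d payload of the curl call.
print(json.dumps(body))
```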



Filter/Term Query on 2 different fields

2014-11-19 Thread Ap


1. I have a query that looks for specific Terms in multiple fields.

Eg. for 2 different fields below

1st level filter --> source = XYZ
2nd level filter --> name = John


The query needs to find all documents that have source = XYZ and then 
name = John.

2. I also need to make a variation of the query that looks for below

1st level filter --> Source not equal to XYZ, ABC
2nd filter --> name = John
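
Both variations can be expressed with a `bool` filter inside a `filtered` query, using `must` for the values to keep and `must_not` for the sources to exclude. This is a sketch; `source_and_name_filter` is a hypothetical helper and the field names are taken from the description above.

```python
def source_and_name_filter(name, include_sources=None, exclude_sources=None):
    """Filtered query with a bool filter built from must / must_not clauses."""
    bool_filter = {"must": [{"term": {"name": name}}]}
    if include_sources:
        bool_filter["must"].append({"terms": {"source": include_sources}})
    if exclude_sources:
        bool_filter["must_not"] = [{"terms": {"source": exclude_sources}}]
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"bool": bool_filter},
            }
        }
    }

# 1. source = XYZ and name = John
print(source_and_name_filter("John", include_sources=["XYZ"]))
# 2. source not in (XYZ, ABC) and name = John
print(source_and_name_filter("John", exclude_sources=["XYZ", "ABC"]))
```

Term filters are not analyzed, so if the fields are indexed with the standard analyzer the values would have to be passed in lowercase ("xyz", "john"); with `not_analyzed` fields they must match exactly.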


Thanks



Entity Matching in ES

2014-11-10 Thread Ap
How can we do Entity matching in ES ?

Eg. A Company name can have different variations. 

1. USA Tech Ltd
2. USA Tech LLC
3. USA Tech Asia Ltd

If the above data is present in ES in the Name field, and a 4th value = 
"USA Euro Tech Ltd" arrives, then it should identify that all the Names are the same.

How can we do that in ES ?

Right now, I am running a Fuzzy query on the complete data set (~100K to 1Mn 
docs), getting the top 20 matches, loading them in memory and running an 
external Jaro-Winkler library (Java-Lucene) on the 20 matches.

Is there a way to directly do Entity matching on the fields in ES ?
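
For reference, the re-scoring step described above can be sketched in a few lines. This is a plain Python implementation of the standard Jaro-Winkler formula (prefix length capped at 4, scaling factor 0.1), written for illustration; it is not the code of the Java/Lucene library mentioned in the post.

```python
def jaro(s1, s2):
    """Jaro similarity between two strings, in the range 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    m = float(matches)
    t = out_of_order / 2.0
    return (m / len1 + m / len2 + (m - t) / m) / 3.0

def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted by the length of the common prefix (max 4)."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1.0 - score)

target = "usa euro tech ltd"
for name in ("USA Tech Ltd", "USA Tech LLC", "USA Tech Asia Ltd"):
    print(name, round(jaro_winkler(name.lower(), target), 3))
```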

Thanks.



Phrase Matching - Urgent

2014-11-06 Thread Ap
How can I use Phrase Match in ES?

eg. A Company Name field can have the following entries:

  - "USA Tech LLC"
  - "USA Tech Ltd"
  - "Asia USA Tech LLC"
  - "Euro USA Tech"

1. I want to write a Java algorithm that will treat all 4 entries above as the same.

2. Also, how can I use Jaro-Winkler to perform this comparison on a phrase 
directly on data in ES?
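
For names like these, where the variants differ by whole words rather than by spelling, a token-based comparison may work better than a character-based one. The sketch below is in Python rather than Java for brevity, and it is only one possible approach: it drops an assumed list of legal suffixes and then computes the overlap coefficient of the remaining tokens. `LEGAL_SUFFIXES`, `core_tokens` and `token_overlap` are hypothetical names.

```python
# Assumed list of legal-form suffixes to ignore; extend as needed.
LEGAL_SUFFIXES = {"llc", "ltd", "inc", "co", "corp"}

def core_tokens(name):
    """Lowercased tokens of a name without legal-form suffixes."""
    return {t for t in name.lower().split() if t not in LEGAL_SUFFIXES}

def token_overlap(a, b):
    """Shared tokens divided by the size of the smaller token set."""
    ta, tb = core_tokens(a), core_tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / float(min(len(ta), len(tb)))

names = ["USA Tech LLC", "USA Tech Ltd", "Asia USA Tech LLC", "Euro USA Tech"]
for name in names[1:]:
    print(name, token_overlap(names[0], name))
```

With this measure every entry scores 1.0 against "USA Tech LLC", because its core tokens (usa, tech) are contained in each of the other names.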

Thanks



Phrase Matching using Java

2014-11-05 Thread Ap
I am trying to do phrase matching to find similar phrases.

Eg. The Name field has the following entries, and all 3 should be evaluated as the same:

   1. "USA Tech Company" 
   2. "USA Tech Company Alabama"
   3. "USA Tech Company California"


Can you suggest Java code that uses a phrase matcher or something similar 
to indicate that the above entries in the Name field are the same (possibly with a higher 
score)?

Thanks.



Term Suggestion Builder - In Java - Info & Example

2014-11-04 Thread Ap
I am exploring Term Suggestion in Java. Can someone help me with the 
following:

1. Java Code Example to suggest/find similar terms from a specific field in 
an Index.

 eg. Text Entered : money

  - Name field values : {abc, money, moneys, mone, xyz }

Possible suggestions with score should be money, moneys, mone.

2. For the above code, how can I make the Term Suggester use the Jaro-Winkler 
algorithm instead of the default?

3. Uses/Explanation for different Term Suggest classes
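
The request body is the same JSON whichever client sends it, so it is sketched here in Python for brevity. The term suggester accepts a `string_distance` option; the Jaro-Winkler value is spelled `jarowinkler` in older releases and `jaro_winkler` in newer ones, so check the documentation for the version in use. `suggest_mode` is set to `always` because the default mode only suggests for terms that are missing from the index, and "money" is present. The suggestion label `name-suggestion` and the helper name are hypothetical.

```python
import json

def term_suggest_body(text, field, distance="jarowinkler"):
    """Body for a term suggestion on one field."""
    return {
        "name-suggestion": {          # arbitrary label for this suggestion
            "text": text,
            "term": {
                "field": field,
                "suggest_mode": "always",
                "string_distance": distance,
            },
        }
    }

body = term_suggest_body("money", "name")
print(json.dumps(body, indent=2))

# Assumption: with a 1.x elasticsearch-py client `es` this could be sent as
#   es.suggest(index="my-index", body=body)
```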


Thanks. 



Using Java, How to retrieve one field from all documents inside an Index.

2014-11-04 Thread Ap

Eg. The Index has 100 documents with fields A, B, C ... Z.

I want to retrieve a list of all A's from the 100 documents in the Index.
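
The usual pattern is a `match_all` search that returns only the wanted field, scrolled over the whole index. It is sketched here in Python for brevity; the search body is the same JSON a Java client would send. `field_values` is a hypothetical helper, the hits below are made-up sample data, and the commented-out call assumes an elasticsearch-py client `es` and an index named `my-index`.

```python
# Ask only for field A in each hit.
SEARCH_BODY = {"query": {"match_all": {}}, "_source": ["A"]}

def field_values(hits, field):
    """Collect one field from search hits, skipping documents that lack it."""
    return [hit["_source"][field] for hit in hits
            if field in hit.get("_source", {})]

# Made-up hits in the shape Elasticsearch returns them.
sample_hits = [
    {"_id": "1", "_source": {"A": "first"}},
    {"_id": "2", "_source": {"A": "second"}},
    {"_id": "3", "_source": {}},
]
print(field_values(sample_hits, "A"))  # ['first', 'second']

# Assumption: with elasticsearch-py, all hits could be streamed with
#   from elasticsearch import helpers
#   hits = helpers.scan(es, query=SEARCH_BODY, index="my-index")
```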
