Re: Operators NEARx, BEFOR, AFTER, FIRSTx, LASTx

2015-03-17 Thread Petr Janský
No one? :-(

Petr

On Wednesday, February 18, 2015 at 12:35:15 UTC+1, Petr Janský wrote:

 Hi Lukas,

 thank you for your answer. I checked the Proximity Matching chapter - 
 match_phrase - and it's what I'm looking for. I'm just not able to find a way 
 to create queries like:

   1. Obama BEFORE Iraq - the first word (not term) appears before the second 
   in the field text
   2. "President Obama" AFTER Iraq - the phrase "President Obama" appears 
   after "Iraq" in the field text

 In other words, match_phrase doesn't have an in_order parameter like 
 span_near, and for span_near I have to use terms - I have to run the analyzer 
 on the words beforehand.
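
 For reference, this is the kind of proximity query I can already build (a minimal 
 sketch; the index name "idnes" and field name "text" are assumptions, not taken 
 from this thread). match_phrase with slop brings the words close together, but it 
 has no in_order flag:

 curl -X GET 'localhost:9200/idnes/_search?pretty' -d '{
   "query": {
     "match_phrase": {
       "text": {
         "query": "Obama Iraq",
         "slop": 50
       }
     }
   }
 }'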

 Do you have any idea how to implement these queries?

 Thanks
 Petr

On Monday, January 19, 2015 at 10:23:21 UTC+1, Lukáš Vlček wrote:

 Hi Petr,

 let me try to address some of your questions:

 ad 1) I am not sure I understand what you mean. If you want to use a span-type 
 query then simply use it instead of the query string query. In particular, 
 if you pass user input into the query it is recommended NOT to use the 
 query string query; you should consider using a different query type (like 
 a span query in your case).

 ad 2) Not sure I fully understand, but I can see a match for some of the 
 requested features in span queries - like slop. I would recommend 
 reading through the Proximity Matching chapters [1] to see how you can use 
 slop.

 ad 3) The input that goes into span queries can go through the text analysis 
 process (if I am not mistaken). The fact that there are term 
 queries behind the scenes does not mean you cannot run your analysis 
 first.
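
 To make that concrete, here is a minimal span_near sketch (the field name "text" and 
 the already-lowercased terms are assumptions; in practice you would run the user's 
 words through the field's analyzer first, e.g. via the _analyze API, and feed the 
 resulting terms into the span clauses). It only matches when "obama" occurs before 
 "iraq" within 50 positions:

 curl -X GET 'localhost:9200/idnes/_search?pretty' -d '{
   "query": {
     "span_near": {
       "clauses": [
         { "span_term": { "text": "obama" } },
         { "span_term": { "text": "iraq" } }
       ],
       "slop": 50,
       "in_order": true
     }
   }
 }'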

 Maybe if you can share some of your configs/documents/queries we can 
 help you more.

 [1] 
 http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/proximity-matching.html

 Regards,
 Lukas

 On Mon, Jan 19, 2015 at 10:02 AM, Petr Janský petr@6hats.cz wrote:

 No one? :-(

 Petr

 On Tuesday, January 13, 2015 at 15:37:18 UTC+1, Petr Janský wrote:

 Hi there,

 I'm looking for a way to expose span_near and span_first 
 functionality to users via a search box in a GUI that uses the query string query.

    1. Is there any easy way to do it?
    2. Will the Elasticsearch folks implement operators like NEARx, BEFOR, 
    AFTER, FIRSTx, LASTx to allow searching (via the query string) by:
       - a specific maximum word distance between keywords
       - the order of keywords
       - the position of a keyword in the field, counted from the start and end 
       of the field text
    3. Span queries only accept terms; is there a way to use 
    words that will be analyzed by a language analyzer (stemming etc.)?


 Thanks
 Petr



Re: Using shingle

2015-03-17 Thread Petr Janský
No one? :-(

Petr

On Friday, February 20, 2015 at 15:29:15 UTC+1, Petr Janský wrote:

 Hi there,

 I've tried to use the shingle filter to get bigrams and trigrams:

 curl -X POST 'localhost:9200/idnes/' -d '{
   "settings" : {
     "analysis" : {
       "filter": {
         "czech_stop": {
           "type":            "stop",
           "stopwords":       "_czech_",
           "ignore_case":     true,
           "remove_trailing": false
         },
         "czech_stop_ngram": {
           "type": "stop",
           "stopwords" : ["a", "i", "k", "o", "s", "u", "v", "z", "do",
             "co", "by", "do", "je", "mu", "mi", "mě", "mně", "mne", "na", "ne", "ní",
             "si", "se", "ta", "to", "té", "ti", "ty", "už", "ve", "za", "že", "aby",
             "ani", "ale", "byl", "jak", "jen", "jde", "kdo", "kdy", "kde", "něm",
             "nich", "něj", "než", "pro", "tak", "ten", "tam", "tady", "těch", "jsou",
             "jsem", "není", "nyní", "nimi", "jako", "jaká", "jaké", "jaká", "právě",
             "který", "která", "které", "jeho", "její", "nebo", "jako", "toho", "kdyby",
             "takový", "taková", "takové", "_czech_"],
           "ignore_case":     true,
           "remove_trailing": false
         },
         "czech_keywords": {
           "type":     "keyword_marker",
           "keywords": ["že"]
         },
         "czech_stemmer": {
           "type":     "stemmer",
           "language": "czech"
         },
         "shingle2_filter": {
           "type":             "shingle",
           "min_shingle_size": 2,
           "max_shingle_size": 2,
           "output_unigrams":  false
         },
         "shingle3_filter": {
           "type":             "shingle",
           "min_shingle_size": 3,
           "max_shingle_size": 3,
           "output_unigrams":  false
         }
       },
       "analyzer": {
         "shingle2s_analyzer": {
           "type":      "custom",
           "tokenizer": "standard",
           "filter":    ["standard", "lowercase", "czech_stop_ngram", "shingle2_filter"]
         },
         "shingle3s_analyzer": {
           "type":      "custom",
           "tokenizer": "standard",
           "filter":    ["czech_stop_ngram", "shingle3_filter"]
         }
       }
     }
   },

   "mappings" : {
     "article" : {
       "_id" : {
         "path" : "reference"
       },

       "properties" : {
         .
         "content2" : { "type": "string", "analyzer": "shingle2_analyzer"},
         "content3" : { "type": "string", "analyzer": "shingle3_analyzer"},
         "content4" : { "type": "string", "analyzer": "shingle2s_analyzer"},
         "content5" : { "type": "string", "analyzer": "shingle3s_analyzer"},
         ..

 If I try my analyzers by calling:

 curl -X GET 
 'localhost:9200/idnes/_analyze?analyzer=shingle3s_analyzer&pretty' -d 'a e 
 i o u s k z na ke ze nad pod za před Norská strana zatím dostatečně 
 nevyhodnotila, jak citlivou otázkou je pro Česko případ synů Evy 
 Michalákové. Tak popisuje současnou situaci premiér Bohuslav Sobotka. Ten 
 již dostal odpověď na dopis od premiérky Norska Erny Solbergové. S obecnými 
 odpověďmi není spokojen a zvažuje do Norska další psaní.' | grep token

 It works fine. In the results there are only trigrams:
    "tokens" : [ {
      "token" : "_ e _",
      "token" : "e _ _",
      "token" : "_ _ Norská",
      "token" : "_ Norská _",
      "token" : "Norská _ zatím",
      "token" : "_ zatím dostatečně",
      "token" : "zatím dostatečně nevyhodnotila",
      "token" : "dostatečně nevyhodnotila _",
      "token" : "nevyhodnotila _ citlivou",
      "token" : "_ citlivou otázkou",
      "token" : "citlivou otázkou _",
      "token" : "otázkou _ _",

 But there is an issue if I use it on indexed data:
 POST idnes/_search?pretty=true 
 {
   "query": {
     "match": {
       "content_type": "Article"
     }
   },
   "facets" : {
     "tag" : {
       "terms" : {
         "fields" : ["content5"],
         "size" : 20
       }
     }
   }
 }

 In the response there are also unigrams.
 "facets": {
    "tag": {
       "_type": "terms",
       "missing": 452,
       "total": 926077,
       "other": 762645,
       "terms": [
          {
             "term": "a",
             "count": 18150
          },
          {
             "term": "to",
             "count": 17131
          },
          {
             "term": "je",
             "count": 14090
          },
          {
             "term": "se",
             "count": 13621
          },
          {
             "term": "na",
             "count": 12285
          },
          ..
          {
             "term": "korun _ _",
             "count": 551
          },
          {
             "term": "_ _ případě",
             "count": 499
          },
          {
             "term": "zobrazení videa musíte",
             "count": 449
          }
          .


    1. Why does this happen?
    2. Is there any other way to skip the "_" filler tokens left by removed 
    stopwords than http://www.elasticsearch.org/blog/searching-with-shingles/, 
    which doesn't work for Lucene 4.4+?

 Thanks
 Petr




Using shingle

2015-02-20 Thread Petr Janský
Hi there,

I've tried to use the shingle filter to get bigrams and trigrams:

curl -X POST 'localhost:9200/idnes/' -d '{
  "settings" : {
    "analysis" : {
      "filter": {
        "czech_stop": {
          "type":            "stop",
          "stopwords":       "_czech_",
          "ignore_case":     true,
          "remove_trailing": false
        },
        "czech_stop_ngram": {
          "type": "stop",
          "stopwords" : ["a", "i", "k", "o", "s", "u", "v", "z", "do",
            "co", "by", "do", "je", "mu", "mi", "mě", "mně", "mne", "na", "ne", "ní",
            "si", "se", "ta", "to", "té", "ti", "ty", "už", "ve", "za", "že", "aby",
            "ani", "ale", "byl", "jak", "jen", "jde", "kdo", "kdy", "kde", "něm",
            "nich", "něj", "než", "pro", "tak", "ten", "tam", "tady", "těch", "jsou",
            "jsem", "není", "nyní", "nimi", "jako", "jaká", "jaké", "jaká", "právě",
            "který", "která", "které", "jeho", "její", "nebo", "jako", "toho", "kdyby",
            "takový", "taková", "takové", "_czech_"],
          "ignore_case":     true,
          "remove_trailing": false
        },
        "czech_keywords": {
          "type":     "keyword_marker",
          "keywords": ["že"]
        },
        "czech_stemmer": {
          "type":     "stemmer",
          "language": "czech"
        },
        "shingle2_filter": {
          "type":             "shingle",
          "min_shingle_size": 2,
          "max_shingle_size": 2,
          "output_unigrams":  false
        },
        "shingle3_filter": {
          "type":             "shingle",
          "min_shingle_size": 3,
          "max_shingle_size": 3,
          "output_unigrams":  false
        }
      },
      "analyzer": {
        "shingle2s_analyzer": {
          "type":      "custom",
          "tokenizer": "standard",
          "filter":    ["standard", "lowercase", "czech_stop_ngram", "shingle2_filter"]
        },
        "shingle3s_analyzer": {
          "type":      "custom",
          "tokenizer": "standard",
          "filter":    ["czech_stop_ngram", "shingle3_filter"]
        }
      }
    }
  },

  "mappings" : {
    "article" : {
      "_id" : {
        "path" : "reference"
      },

      "properties" : {
        .
        "content2" : { "type": "string", "analyzer": "shingle2_analyzer"},
        "content3" : { "type": "string", "analyzer": "shingle3_analyzer"},
        "content4" : { "type": "string", "analyzer": "shingle2s_analyzer"},
        "content5" : { "type": "string", "analyzer": "shingle3s_analyzer"},
        ..

If I try my analyzers by calling:

curl -X GET 
'localhost:9200/idnes/_analyze?analyzer=shingle3s_analyzer&pretty' -d 'a e 
i o u s k z na ke ze nad pod za před Norská strana zatím dostatečně 
nevyhodnotila, jak citlivou otázkou je pro Česko případ synů Evy 
Michalákové. Tak popisuje současnou situaci premiér Bohuslav Sobotka. Ten 
již dostal odpověď na dopis od premiérky Norska Erny Solbergové. S obecnými 
odpověďmi není spokojen a zvažuje do Norska další psaní.' | grep token

It works fine. In the results there are only trigrams:
   "tokens" : [ {
     "token" : "_ e _",
     "token" : "e _ _",
     "token" : "_ _ Norská",
     "token" : "_ Norská _",
     "token" : "Norská _ zatím",
     "token" : "_ zatím dostatečně",
     "token" : "zatím dostatečně nevyhodnotila",
     "token" : "dostatečně nevyhodnotila _",
     "token" : "nevyhodnotila _ citlivou",
     "token" : "_ citlivou otázkou",
     "token" : "citlivou otázkou _",
     "token" : "otázkou _ _",


But there is an issue if I use it on indexed data:
POST idnes/_search?pretty=true 
{
  "query": {
    "match": {
      "content_type": "Article"
    }
  },
  "facets" : {
    "tag" : {
      "terms" : {
        "fields" : ["content5"],
        "size" : 20
      }
    }
  }
}

In the response there are also unigrams.
   "facets": {
      "tag": {
         "_type": "terms",
         "missing": 452,
         "total": 926077,
         "other": 762645,
         "terms": [
            {
               "term": "a",
               "count": 18150
            },
            {
               "term": "to",
               "count": 17131
            },
            {
               "term": "je",
               "count": 14090
            },
            {
               "term": "se",
               "count": 13621
            },
            {
               "term": "na",
               "count": 12285
            },
            ..
            {
               "term": "korun _ _",
               "count": 551
            },
            {
               "term": "_ _ případě",
               "count": 499
            },
            {
               "term": "zobrazení videa musíte",
               "count": 449
            }
            .


   1. Why does this happen?
   2. Is there any other way to skip the "_" filler tokens left by removed 
   stopwords than http://www.elasticsearch.org/blog/searching-with-shingles/, 
   which doesn't work for Lucene 4.4+?

Thanks
Petr



Re: Operators NEARx, BEFOR, AFTER, FIRSTx, LASTx

2015-02-18 Thread Petr Janský
Hi Lukas,

thank you for your answer. I checked the Proximity Matching chapter - match_phrase 
- and it's what I'm looking for. I'm just not able to find a way to create 
queries like:

   1. Obama BEFORE Iraq - the first word (not term) appears before the second in 
   the field text
   2. "President Obama" AFTER Iraq - the phrase "President Obama" appears after 
   "Iraq" in the field text
   
In other words, match_phrase doesn't have an in_order parameter like 
span_near, and for span_near I have to use terms - I have to run the analyzer on 
the words beforehand.

Do you have any idea how to implement these queries?

Thanks
Petr

On Monday, January 19, 2015 at 10:23:21 UTC+1, Lukáš Vlček wrote:

 Hi Petr,

 let me try to address some of your questions:

 ad 1) I am not sure I understand what you mean. If you want to use a span-type 
 query then simply use it instead of the query string query. In particular, 
 if you pass user input into the query it is recommended NOT to use the 
 query string query; you should consider using a different query type (like 
 a span query in your case).

 ad 2) Not sure I fully understand, but I can see a match for some of the 
 requested features in span queries - like slop. I would recommend 
 reading through the Proximity Matching chapters [1] to see how you can use 
 slop.

 ad 3) The input that goes into span queries can go through the text analysis 
 process (if I am not mistaken). The fact that there are term 
 queries behind the scenes does not mean you cannot run your analysis 
 first.

 Maybe if you can share some of your configs/documents/queries we can help 
 you more.

 [1] 
 http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/proximity-matching.html

 Regards,
 Lukas

 On Mon, Jan 19, 2015 at 10:02 AM, Petr Janský petr@6hats.cz wrote:

 No one? :-(

 Petr

 On Tuesday, January 13, 2015 at 15:37:18 UTC+1, Petr Janský wrote:

 Hi there,

 I'm looking for a way to expose span_near and span_first 
 functionality to users via a search box in a GUI that uses the query string query.

    1. Is there any easy way to do it?
    2. Will the Elasticsearch folks implement operators like NEARx, BEFOR, 
    AFTER, FIRSTx, LASTx to allow searching (via the query string) by:
       - a specific maximum word distance between keywords
       - the order of keywords
       - the position of a keyword in the field, counted from the start and end 
       of the field text
    3. Span queries only accept terms; is there a way to use 
    words that will be analyzed by a language analyzer (stemming etc.)?


 Thanks
 Petr



Re: Operators NEARx, BEFOR, AFTER, FIRSTx, LASTx

2015-01-19 Thread Petr Janský
No one? :-(

Petr

On Tuesday, January 13, 2015 at 15:37:18 UTC+1, Petr Janský wrote:

 Hi there,

 I'm looking for a way to expose span_near and span_first functionality 
 to users via a search box in a GUI that uses the query string query.

    1. Is there any easy way to do it?
    2. Will the Elasticsearch folks implement operators like NEARx, BEFOR, 
    AFTER, FIRSTx, LASTx to allow searching (via the query string) by:
       - a specific maximum word distance between keywords
       - the order of keywords
       - the position of a keyword in the field, counted from the start and end 
       of the field text
    3. Span queries only accept terms; is there a way to use 
    words that will be analyzed by a language analyzer (stemming etc.)?


 Thanks
 Petr




Operators NEARx, BEFOR, AFTER, FIRSTx, LASTx

2015-01-13 Thread Petr Janský
Hi there,

I'm looking for a way to expose span_near and span_first functionality 
to users via a search box in a GUI that uses the query string query.

   1. Is there any easy way to do it?
   2. Will the Elasticsearch folks implement operators like NEARx, BEFOR, AFTER, 
   FIRSTx, LASTx to allow searching (via the query string) by:
      - a specific maximum word distance between keywords
      - the order of keywords
      - the position of a keyword in the field, counted from the start and end of 
      the field text (a span_first sketch follows below)
   3. Span queries only accept terms; is there a way to use words that will be 
   analyzed by a language analyzer (stemming etc.)?
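
For context, this is the kind of span_first query I would like to expose through the 
search box (a minimal sketch; the index and field names "idnes" and "text" are my 
assumptions here, not something defined in this thread). It matches documents where 
the term "obama" occurs within the first 5 positions of the field, i.e. a FIRSTx-style 
constraint:

curl -X GET 'localhost:9200/idnes/_search?pretty' -d '{
  "query": {
    "span_first": {
      "match": {
        "span_term": { "text": "obama" }
      },
      "end": 5
    }
  }
}'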


Thanks
Petr



Get X word before and after search word

2014-06-17 Thread Petr Janský
Hello,

I'm trying to find a way to get the words/terms around a search word. E.g., let's 
have a document with the text "The best search engine is ElasticSearch". I will 
search for "best" and want to get the information that the word "search" appears X 
positions after the search word.

Thx
Petr



Query join cross types

2014-05-27 Thread Petr Janský
Hello guys,

I'm trying to find a way to solve this case with two types:

"product" : {
    "properties" : {
        "kbi_id" : { "type" : "string", "index" : "not_analyzed"},
        "agreement_status" : { "type" : "string", "index" : "not_analyzed"},   - values Y/N
        "product_code" : { "type" : "string", "index" : "not_analyzed"},
        "date_valid" : { "type" : "string", "format" : "dd.MM."}
}}
~120M records


"note" : {
    "properties" : {
        "note" : { "type" : "string", "analyzer" : "cestina_hunspell"},
        "kbi_id" : { "type" : "string", "index" : "not_analyzed"},
        "created_date" : { "type" : "date", "format" : "dd.MM. HH:mm:ss"}
}},
~3M records

The kbi_id field is the client ID - the join field. I don't have clients in my index.

I would like to get all note records for which there exists at least one product 
record with agreement_status=Y and simultaneously at least one product record with 
agreement_status=N.

Any ideas?

Thx
Petr



elasticsearch-carrot2 JAVA API call

2014-03-05 Thread Petr Janský
Hello,

I use elasticsearch-carrot2 (https://github.com/carrot2/elasticsearch-carrot2) and I 
was not able to find out how to get the clustering result using the Java API, e.g. for

GET _search_with_clusters?pretty=true
{
  "query" : {
    "bool": {
      "must": [
        { "term": { "content": "zeman"}}
      ]
    }
  },
  "query_hint": "vodafone",
  "field_mapping": {
    "url":     ["_source.reference"],
    "title":   ["_source.title"],
    "content": ["_source.content"]
  }
}


The result:

{
   "took": 12,
   "timed_out": false,
   "_shards": {
      "total": 49,
      "successful": 49,
      "failed": 0
   },
   "hits": {
      "total": 2080562,
      "max_score": 1,
      "hits": [.]
   },
   "clusters": [
      {
         "id": 0,
         "score": 4.094963866819582,
         "label": "Možné Scénáře",
         "phrases": [
            "Možné Scénáře"
         ],
         "documents": [
            "http://www.denik.cz/z_domova/palestinsky-velvyslanec-zemrel-v-nemocnici-na-misto-jede-policejni-sef-cervicek.html#2114206",
            "http://www.denik.cz/z_domova/palestinsky-velvyslanec-zemrel-v-nemocnici-na-misto-jede-policejni-sef-cervicek.html#2114186",
            "http://www.denik.cz/z_domova/palestinsky-velvyslanec-zemrel-v-nemocnici-na-misto-jede-policejni-sef-cervicek.html#2114174"
         ]
      },
      
   ],
   "info": {
      "algorithm": "lingo",
      "search-millis": 11,
      "clustering-millis": 26,
      "total-millis": 38,
      "include-hits": true
   }
}


Is there any way to call this query using the Java API?

Thx
Petr



Re: elasticsearch-carrot2 JAVA API call

2014-03-05 Thread Petr Janský
Hi Dawid,

thank you for your reply. I tried the sample you sent me
https://github.com/carrot2/elasticsearch-carrot2/blob/master/src/test/java/org/carrot2/elasticsearch/ClusteringActionTests.java#L65
 

but I got:
Exception in thread "main" 
org.elasticsearch.common.util.concurrent.UncategorizedExecutionException: 
Failed execution
at 
org.elasticsearch.action.support.AdapterActionFuture.rethrowExecutionException(AdapterActionFuture.java:88)
at 
org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:49)
at 
es.search.ClusteringActionTests.testAttributes(ClusteringActionTests.java:85)
at es.search.ClusteringActionTests.main(ClusteringActionTests.java:61)
Caused by: java.lang.ClassCastException: java.lang.Boolean cannot be cast 
to java.util.Map
at 
org.elasticsearch.common.io.stream.StreamInput.readMap(StreamInput.java:319)
at 
org.carrot2.elasticsearch.ClusteringAction$ClusteringActionRequest.readFrom(ClusteringAction.java:407)
at 
org.elasticsearch.transport.netty.MessageChannelHandler.handleRequest(MessageChannelHandler.java:209)
at 
org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:109)
at 
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at 
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at 
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at 
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at 
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at 
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at 
org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at 
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at 
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at 
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at 
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at 
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at 
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at 
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at 
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at 
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

I use:
Elasticsearch 0.9.10
elasticsearch-carrot2 1.2.2
with the jars from elasticsearch-carrot2-1.2.0.zip

thx
Petr

On Wednesday, March 5, 2014 at 13:03:01 UTC+1, Dawid Weiss wrote:

 In short: yes. Perhaps the best way to see how to use the Java API 
 is to look at the unit tests. For example: 


 https://github.com/carrot2/elasticsearch-carrot2/blob/master/src/test/java/org/carrot2/elasticsearch/ClusteringActionTests.java#L65
  

 I would also use a language code field and mapping for Czech; this 
 should improve your clustering results (especially if you tweak the 
 default lexical resources in Carrot2... patches welcome :). 


 https://github.com/carrot2/elasticsearch-carrot2/blob/master/src/test/java/org/carrot2/elasticsearch/ClusteringActionTests.java#L100
  

 Dawid 

 On Wed, Mar 5, 2014 at 10:53 AM, Petr Janský petr@6hats.cz wrote: 
  Hello, 
  
  I use elasticsearch-carrot2 and I was not able to find out how to get the 
  clustering result using the Java API, e.g. for 
  
  GET _search_with_clusters?pretty=true 
  { 
    "query" : { 
      "bool": { 
        "must": [ 
          { "term": { "content": "zeman"}} 
        ] 
      } 
    }, 
    "query_hint": "vodafone", 
    "field_mapping": { 
      "url

Expanding terms

2014-02-25 Thread Petr Janský
Hello,

I'm trying to find a way to:

   1. expand a term - get all the words and counts that are relevant for a 
   term(s)
   2. get relevant words for a query - a list of all the words that 
   are highlighted
   3. get phrases by word - e.g. for the word "war": "world war", "second world 
   war", "the second world war", 

and a complicated one:

   1. is there a way to get/highlight only the words that are relevant 
   for multiple term conditions, e.g. 

"must": [
   { "wildcard": {
        "content_morfo": {
           "value": "v*"
        }
   }},
   { "wildcard": {
        "content_morfo": {
           "value": "==AA*=="
        }
   }}
]

thx
Petr
  
